



FOREWORD 


The Software Engineering Laboratory (SEL) is an organization 
sponsored by the National Aeronautics and Space Administra- 
tion/Goddard Space Flight Center (NASA/GSFC) and created for 
the purpose of investigating the effectiveness of software 
engineering technologies when applied to the development of 
applications software. The SEL was created in 1977 and has 
three primary organizational members: 

NASA/GSFC (Systems Development Branch) 

The University of Maryland (Computer Sciences Department) 

Computer Sciences Corporation (Systems Development
Operation)

The goals of the SEL are (1) to understand the software 
development process in the GSFC environment; (2) to measure 
the effect of various methodologies, tools, and models on 
this process; and (3) to identify and then to apply suc- 
cessful development practices. The activities, findings, 
and recommendations of the SEL are recorded in the Software 
Engineering Laboratory Series, a continuing series of reports 
that includes this document. The papers contained in this 
document appeared previously as indicated in each section. 

Single copies of this document can be obtained by writing to 

Systems Development Branch 

Code 552 

NASA/GSFC 

Greenbelt, Maryland 20771 
SECTION 1 - INTRODUCTION


This document is a collection of selected technical papers 
produced by participants in the Software Engineering 
Laboratory (SEL) during the period November 1989 through
October 1990. The purpose of the document is to make 
available, in one reference, some results of SEL research 
that originally appeared in a number of different forums. 
This is the eighth such volume of technical papers produced 
by the SEL. Although these papers cover several topics 
related to software engineering, they do not encompass the 
entire scope of SEL activities and interests. Additional 
information about the SEL and its research efforts may be 
obtained from the sources listed in the bibliography at the 
end of this document. 

For the convenience of this presentation, the seven papers 
contained here are grouped into four major categories: 

• Software Measurement Studies 

• Software Models Studies 

• Software Tools Studies 

• Ada Technology Studies 

The first category presents experimental research and 
evaluation of software measurement; the second category 
presents studies on models for software reuse; the third
category presents a software tool evaluation; the last 
category represents Ada technology and includes studies in 
the areas of reuse and specifications. 

The SEL is actively working to understand and improve the 
software development process at Goddard Space Flight Center 
(GSFC). Future efforts will be documented in additional
volumes of the Collected Software Engineering Papers and
other SEL publications. 
SECTION 2 - SOFTWARE MEASUREMENT STUDIES

The technical paper included in this section was originally 
prepared as indicated below. 

• "Design Measurement: Some Lessons Learned," 

H. Rombach, IEEE Software, March 1990
Design Measurement:
Some Lessons Learned

H. Dieter Rombach, University of Maryland at College Park

March 1990
0740-7459/90/0300/0017/$01.00 © 1990 IEEE

Most software measurements are derived from source code. A
promising addition to the field is design measurement, which
applies measurement principles to front-end products and
processes.

Measurement is becoming recognized as a useful way to soundly
plan and control the execution of software projects. However,
current measurement practices are deficient in four ways:

• They emphasize the back end of development, mainly the
coding and testing phases;

• they are biased toward software products, as opposed to
processes;

• they are based on unsound measurement methodologies; and

• they are not integrated with development activities.

In short, most software measurements are derived solely from
source code. Design measurement, as Figure 1 illustrates, is
the application of measurement to design processes (the word I
use to refer to all kinds of activities) and/or the resulting
design products (the word I use to refer to all kinds of
documents).

[Figure 1. The scope of measurement. Life-cycle objects
measured: products/documents and processes/methods/tools.
Key: current emphasis of measurement; proposed addition to
measurement.]

Design measurement is a desirable addition to traditional
code-based measures because it lets you capture important
aspects of the product and the process earlier in the life
cycle so you can take corrective actions earlier. In turn,
this benefit leads to a potentially high payoff, since we know
that errors are more costly if committed early in the life
cycle and not caught until much later.

In this article, I extract from several measurement projects
some of the important lessons I have learned about measurement
in general and design measurement in particular. I have
synthesized these lessons into a design-measurement framework
in an attempt to communicate my personal measurement
experience to other software engineers.

My general measurement experience was gained on the
Distos/Incas project at the University of Kaiserslautern, West
Germany; several projects at the National Aeronautics and
Space Administration’s Software Engineering Laboratory at the
Goddard Space Flight Center in Greenbelt, Md.; and the
Tailoring a Measurement Environment project at the University
of Maryland.

My design-measurement experience was gained on the
Distos/Incas project, which measured intercomponent
dependencies to predict future maintenance behavior; a study
at the University of Maryland that compared the effect of
various design methods; and various studies at the Software
Engineering Laboratory to develop quantitative design
baselines.

Measurement 

From each of these projects, I learned
important lessons about "effective" soft-
ware measurement. These lessons tended
to fall in three areas:

• how measurement must be applied in 
individual experiments or case studies, 

• how measurement can help continu- 
ously improve an organization's state of 
the practice, and 

• why measurement requires automated 
support. 

Experiments and case studies. As part of
the Distos/Incas project on distributed
systems, my colleagues and I developed
the object-oriented language Lady. One
objective of this language was to improve
the maintainability of the distributed soft-
ware written in it. The funding agency, the
German Ministry for Research and Tech-
nology, requested empirical evidence of
whether (and to what degree) this objec-
tive had been met. To find out, we con-
ducted a large, controlled experiment in
which we developed 12 systems, six in
Lady and six in a traditional procedural
language. We then studied their conse-
quent maintenance.

This experiment has taught us seven les- 
sons: 

• There are many types of measurement 
goals. Measurement goals can differ in the 
type of object the measurement focuses 
on, in their intended effect on the object, 
and in the people interested in them. A 
measurement goal may focus on object
types such as processes, products, lan-
guages, methods, and tools. Its intended
effect is either passive (when you want to
understand the object) or active (when
you want to predict, control, and improve
the object). The people interested range
from language and tool developers to 
customers and users. 

In the Distos/Incas experiment, we had 
two main goals. First, we wanted to deter- 
mine and explain differences in the main- 
tenance behavior of systems implemented 
in Lady and those implemented in the tra- 
ditional language. Second, we wanted to 
predict the maintenance behavior of Lady 
systems based on structural complexity. 

The first goal focused on the languages 
used; its intended effect was passive be-
cause it was meant to help us understand
Lady’s effect on maintainability; and it re-
flected the interest of the language de-
signers. The second goal focused on the
product and maintenance process; its in- 
tended effect was active because it was 


meant to help us guide and control the 
appropriate use of Lady to build main- 
tainable systems; and it reflected the inter- 
est of the managers and developers plan- 
ning to use Lady. 

• Models and measures are inseparable. 
Measures are intended to characterize 
some aspect of a software object in quanti-
tative terms, but different models of the
same aspect are possible. Without an ex-
plicit specification of the chosen model, it
is impossible to judge the appropriateness
of the quantitative measures selected.

In the Distos/Incas experiment, the
maintainability model was based on the
cost required to perform a change during
maintenance and the effect of the change
on the maintained product. With this
model, measures like "effort in staff hours
to perform a change" and "number of
modules affected by a change" were justi-
fied. To predict maintenance behavior
based on structural complexity, we chose
Sallie Henry and Dennis Kafura’s model
for information flow between compo-
nents. In this model, measures such as
"number of incoming information flows
per module" and "number of outgoing in-
formation flows per module" were justi-
fied.
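
To make the two information-flow measures concrete, here is a minimal sketch. It is not the instrumentation used in the Distos/Incas experiment. It assumes the information flows have already been extracted as (source module, destination module) pairs; the function name `flow_counts` and the module names are hypothetical.

```python
from collections import Counter


def flow_counts(flows):
    """Count incoming and outgoing information flows per module.

    `flows` is an iterable of (source, destination) module-name pairs.
    Duplicate pairs are counted once (an assumption of this sketch).
    Returns {module: (incoming, outgoing)}.
    """
    unique = set(flows)
    incoming = Counter(dst for _, dst in unique)
    outgoing = Counter(src for src, _ in unique)
    modules = {name for pair in unique for name in pair}
    return {m: (incoming[m], outgoing[m]) for m in sorted(modules)}


# Hypothetical flows between three modules.
example_flows = [("parser", "store"), ("ui", "store"), ("store", "ui")]
counts = flow_counts(example_flows)
```

Counting each distinct pair once is a modeling choice; a different model of "information flow" would justify a different count, which is the point of this lesson.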

• You need different types of measures.
We learned that you need both abstract
and specific measures, process and product
measures, direct and indirect measures, and
objective and subjective measures.

Most measures reported in the litera-
ture are based on some abstract model
(for example, control-flow measures
based on abstract program graphs). Such
abstract measures must be tailored to the
specific characteristics of the object to be
measured (for example, Ada control-flow
measures must be based on Ada's specific
control-flow features).

Product measures (such as design com-
plexity) are not sufficient to support ac-
tive measurement goals. Planned im-
provement of quality and productivity is
only possible through measurably im-
proved (such as fewer design errors) de-
velopment processes.

Direct measures are intended to quan-
tify some quality aspect (the number of
staff hours spent on design is a direct mea-
sure of design cost, for example); indirect
measures of some quality aspect are in-
tended to predict this quality based on
other information that can be derived ear-
lier (for example, the number of product
requirements may be an indirect measure
to predict the number of staff hours you
expect to be spent on design).

Objective measures (such as lines of
code) are defined well enough so that two
people should compute the identical
value from the same object indepen-
dently. Subjective measures (such as staff
experience) are computed based on a
subjective estimation or a compromise
among a group of people. Objective mea-
sures are easier to automate than subjec-
tive measures.

In the Distos/Incas experiment, the
measure "number of incoming informa-
tion flows per module" is an abstract mea-
sure. To collect this measure, we had to
determine how "incoming information
flow" could be measured from programs
implemented in Lady. The maintenance-
effort measures are process measures; the
structural-complexity measures are prod-
uct measures. The maintenance-effort
measures are direct measures of cost; the
structural-complexity measures are direct
measures of product complexity and indi-
rect measures of maintenance cost and ef-
fect: they do not directly characterize
maintenance cost and effect but are ex-
pected to help predict them.

• Measurement-based analysis results 
are only as good as the data they are based 
on. It is important to recognize the limits 
of interpreting measures depending on 
their scale (that is, nominal, ordinal, in- 
terval, or ratio) and the validity of the un- 
derlying data. Validating data is a very 
time-consuming and often underesti-
mated task. However, the sensitive task of
interpreting data becomes guesswork if
you try to use inappropriate interpreta-
tions or fail to consider the validity of the
underlying data.

In the Distos/Incas experiment, we
used the complexity measures only as or-
dinal measures because we felt that they
could predict that a more complex Lady 
system would require more effort per 
maintenance change, but not how much 
more. About half my time on the Dis- 
tos/Incas experiment was spent on data 
validation. 
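
One way to respect an ordinal scale is to analyze ranks rather than raw values. The sketch below computes a Spearman rank correlation in plain Python. It is an illustration only: the article does not say which statistic was used in the experiment, and the data values and function names are hypothetical.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: the Pearson correlation of the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: complexity score and effort per change for four systems.
complexity = [3, 10, 7, 25]
effort_hours = [1.0, 4.5, 2.0, 30.0]
rho = spearman(complexity, effort_hours)
```

Because only the ordering enters the computation, the result says that more complex systems needed more effort, but nothing about how much more.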

• You need a sound experimental ap-
proach. A measurement-based experi-
ment requires extensive planning, te-
dious data collection and validation, and
careful interpretation of the collected
data. As in other experimental disciplines,
you need a formal approach to experi-
mentation.

In the Distos/Incas experiment, we for- 
mulated an approach for the experimen- 
tal validation of structural-complexity 
measures. Our approach has six steps: 

1. Model the quality of interest (main-
tainability, in this case) and quantify it into
direct measures.

2. Model the product complexity in a
way that lets you identify all the aspects
that may affect the quality of interest.

3. Explicitly state your hypotheses
about the effect of product complexity on
the quality of interest.

4. Plan and perform an appropriate ex-
periment or case study, including the col-
lection and validation of the prescribed
data.

5. Analyze the data and validate the
hypotheses.

6. Assess the just-completed experi-
mental validation and, if necessary, pre-
pare for future experimental validations
by refining the quality and product-com-
plexity models, your hypotheses, the ex-
periment itself, and the procedures used
for data collection, validation, and analy-
sis. In a way, this step is a built-in validation
and improvement of the experimental
validation itself.

• You must report specific measurement
results in context. It is not useful to report
measurement results from an experiment
or case study without carefully characteriz-
ing the study's context. The way you pre-
sent your results should put the reader in
a position to repeat the experiment or
case study. Only then can the reader agree
or disagree with your conclusions. It is not
only useless to present results out of con-
text, it is also dangerous, because it may
lead to inappropriate perceptions.

I have published some of the results of
the Distos/Incas experiment together
with the necessary context information.
The results suggest that Lady programs
are more maintainable than traditional-
language programs and that structural
complexity is a useful predictor for a
component’s maintainability.

The advantage of providing the experi-
mental context is that readers can agree
or disagree with the experimental ap-
proach chosen. For readers who disagree
with the approach, the results have no
value; for readers who agree with the ap-
proach, the results may confirm or add to
their current understanding. In the Dis-
tos/Incas experiment, we found that
structural complexity cannot be com-
pared across language boundaries based
on the suggested language-specific com-
plexity measures. However, the proposed
abstract complexity measures seem ap-
propriate.

• You must assess each experimental val-
idation itself. It is important to transfer
knowledge gained from one experiment
to the next. This lets you state better goals,
use better measures, and interpret the re-
sults in a broader context.

In the Distos/Incas experiment, this as-
sessment was explicitly integrated, as step
6, into our experimental approach. As a
consequence of this postmortem assess- 
ment, we posed many new questions, 
some of which led to the follow-up experi- 
ments I outline later. 

At the University of Maryland, we have developed a
general measurement approach called
the goal/question/metric paradigm.
The GQM paradigm is broader in scope
and formulated in more operational
terms than the specific experimental ap-
proach applied in the Distos/Incas exper-
iment. However, both agree on two major
measurement principles: First, measure-
ment must be top-down: measurement
goals define what measures should be col-
lected. Second, the data interpretation
must take place in the context of some
goal and hypothesis.

The GQM paradigm has four steps: 

1. State measurement goals in opera-
tional terms. You do this step using tem-
plates, which help you formulate goals
and refine them into questions and mea-
sures.

2. Plan measurement procedures to
support the collection and validation of
data needed to compute the measures
prescribed in step 1.

3. Collect and validate data.

4. Analyze and interpret the collected
data and measures in the context of the
questions and goals stated in step 1.
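
The top-down refinement in step 1 can be sketched as a small data structure. This is a hedged illustration, not the GQM templates themselves: the class names, field names, and example questions are hypothetical. The three goal fields loosely mirror the dimensions named earlier (type of object, intended effect, and interested people).

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    text: str
    measures: list = field(default_factory=list)


@dataclass
class Goal:
    object_type: str        # what the goal focuses on
    intended_effect: str    # passive (understand) or active (predict, control)
    interested_people: str  # whose viewpoint the goal reflects
    questions: list = field(default_factory=list)

    def measures_to_collect(self):
        """Derive the measures top-down: only measures that refine some
        question of this goal are collected, each listed once."""
        collected = []
        for question in self.questions:
            for measure in question.measures:
                if measure not in collected:
                    collected.append(measure)
        return collected


# Hypothetical goal, loosely modeled on the second Distos/Incas goal.
goal = Goal(
    object_type="product and maintenance process",
    intended_effect="active: predict maintenance behavior",
    interested_people="managers and developers",
    questions=[
        Question("How costly is a change?", ["staff hours per change"]),
        Question("How far does a change spread?",
                 ["modules affected per change", "staff hours per change"]),
    ],
)
plan = goal.measures_to_collect()
```

Keeping each measure attached to a question also preserves the context needed in step 4, where data are interpreted against the questions and goals that motivated them.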

We have expanded the GQM paradigm
into the quality-improvement paradigm,
which aims to facilitate continuous im-
provement of an organization's software-
engineering practices. The quality-im-
provement paradigm embodies three
basic measurement principles: First, mea-
surement must be applied continuously
to all projects in an organization. Second,
measurement must be an integral part of
each project: "development" must in-
clude both software construction and mea-
surement. Third, the experience gained
from each project must be recorded in a
measurement database and be made
available to future projects.

The quality-improvement paradigm has
six steps:

1. Characterize the project environ-
ment.

2. State improvement goals in opera-
tional terms. Again, this is done through
templates that help you formulate goals
and refine them into questions and mea-
sures.

3. Plan the project (by selecting appro-
priate methods and tools) and the mea-
surement procedures to support the col-
lection and validation of data prescribed
in step 2.

4. Perform the project and the data col-
lection and validation.

5. Analyze and interpret the collected
data in the context of the questions and
improvement goals stated in step 2.

6. Return to step 1 armed with the ex-
perience gained from this project.

Applying the quality-improvement par-
adigm at NASA’s Software Engineering
Laboratory has led to a broad body of
measurement experience. At the SEL,
the quality-improvement paradigm is now
an integral part of development (and just
recently maintenance) activities to iden-
tify the quality goals of interest, use stan-
dard measurement procedures to collect
the necessary data from ongoing produc-
tion projects, validate and interpret the
data, and maintain a corporate measure-
ment database.

The Goddard Space Flight Center has
benefited from this measurement-based
improvement approach in many ways,
ranging from a better understanding of
the weaknesses and strengths of its envi-
ronment, to better planning, to the devel-
opment of a new standard set of develop-
ment methods and tools, to higher
productivity and the production of higher
quality software.

My own active involvement in the SEL
has helped me better understand several
issues related to the introduction of mea-
surement into a production environment:

• Introducing measurement has far-
reaching consequences. Introducing
measurement to improve an organ-
ization’s development practices requires
fundamental changes of the organization.
It does not just add data collection to the
existing development activities; it really
changes the existing development activi-
ties by making them more transparent.

In addition, the effective incorporation
of measurement into an organization re-
quires changes in the reward structure so
it is consistent with the goals motivated by
measurement and so the additional ef-
forts spent on data collection and valida-
tion are rewarded. All in all, measurement
can reveal the advantages and disadvan-
tages of current practices and spur
changes. Inappropriate measures can be
counterproductive because they may cause
the wrong changes.

At the SEL, each project member fills
out a data-collection form every time he
makes a change, to capture the nature,
cost, and effect of that change, and
weekly to capture the effort spent on
activities and products. Filling out these
forms has become as routine as writing
code. Special measurement employees
validate the collected forms, maintain the
measurement database, and produce pe-
riodic reports.

• You must justify the cost of measure-
ment. Measurement costs! The cost is ac-
ceptable if it is justified by the expected
quality and productivity improvements.
Measurement itself can be used to quan-
tify the improvement potential by captur-
ing the amount of rework, for example.
The GQM paradigm itself helps you build
the case that investment in capturing cer-
tain measures may pay off by tying them to
an organization’s obvious improvement
needs.

At the SEL, each project spends, on av-
erage, 3 percent of its budget on data col-
lection and validation. The organization
spends an additional 4 to 6 percent on off-
line data processing and analysis. How-
ever, you should expect a higher invest-
ment up front to build a new program.
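
As a rough worked example of these percentages, the sketch below estimates measurement overhead for one project. It assumes the additional 4 to 6 percent is taken against the same project budget, which the text does not state explicitly. The budget figure and function name are hypothetical.

```python
def measurement_cost(project_budget, collection_pct=3, analysis_pct_range=(4, 6)):
    """Estimate measurement overhead for one project.

    collection_pct: share of the budget spent on data collection and
        validation (3 percent on average at the SEL).
    analysis_pct_range: organizational share for off-line data processing
        and analysis (4 to 6 percent), assumed here to be relative to the
        same project budget.
    Returns (collection_cost, (analysis_low, analysis_high)).
    """
    collection = project_budget * collection_pct / 100
    low_pct, high_pct = analysis_pct_range
    analysis = (project_budget * low_pct / 100, project_budget * high_pct / 100)
    return collection, analysis


# Hypothetical project budget of 1,000,000 (any currency unit).
collection, (analysis_low, analysis_high) = measurement_cost(1_000_000)
total_low = collection + analysis_low
total_high = collection + analysis_high
```

Under that assumption the combined overhead falls between 7 and 9 percent of the project budget, before any up-front investment in building a new program.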

• We must address both technology- 
transfer and research issues. The technol-
ogy to establish an improvement program
exists, as the SEL and other organizations
have shown. Using the available technol-
ogy is mainly a technology-transfer prob-
lem.

However, there are important areas that
need more research. These areas include
the formalization of measurement plan-
ning, support for data interpretation, sup-
port for learning based on measurement
results and reusing what has been learned
across projects and environments, and
the appropriate automated support for all
these activities, especially the appropriate
organization of corporate measurement
databases.

One of the largest corporate measure-
ment databases exists at the SEL. Built
over the last 12 years, it includes measure-
ment data on product characteristics (size
and complexity), process characteristics
(effort, changes, and defects), the effec-
tiveness of methodologies (what types of
faults were easily detected using method
X), and project characteristics (methods
and tools used, and personnel experi-
ence). At first, measurement covered
only the development stages, but mainte-
nance has recently been added.

Automated support. Much research re- 
mains to be done to properly integrate 
measurement into software development 
and maintenance and to provide auto-
mated support in the form of software-en-
gineering environments.

In the Tailoring a Measurement Envi-
ronment project at the University of Mary-
land, we address all these measurement-
related issues in the context of the
framework provided by the quality-im-
provement paradigm. We try to formal-
ize models and we support characterizing
corporate environments, planning con-
struction and measurement activities, col-
lecting, validating, and analyzing data,
and learning from the measurement re-
sults to do a better job in the next project.
We are developing a series of TAME pro-
totypes based on an architecture that sup-
ports all these activities.

From the TAME project, we have
learned that:

• You need automated support. The 
amount of information accumulated in 
an organization that applies a measure- 
ment-based improvement approach can- 
not be handled manually. Also, without 
automated support, results cannot be 
made available to interested people in 
real time so they can be used to support
project decisions.

In establishing the SEL program, we ini-
tially collected data without database sup-
port. After about six months of collecting
maintenance data from only two projects,
we depended on database support to
maintain control of the data-collection
process.

It takes more than just tools to support
the automated collection of product data.
We also need automated support that
spans the entire set of measurement activ-
ities suggested by the quality-improve-
ment paradigm. In the TAME project, we
are developing tool support for the for-
mulation of goals, the derivation of mea-
sures, the interpretation of data, the re-
porting of measurement results, and the
maintenance of an experience base.

• You must integrate construction and
measurement support. Measurement
processes must be tailored to the con-
struction processes they are to measure.
The construction processes, on the other
hand, must be designed to be measurable
to the degree necessary.

Often, measurement is expected to an-
swer questions about the construction
process that cannot be answered based on
the way construction is performed. Very
often, the reason for such inconsistencies
is that there exists no explicit agreement
on how construction is or should be per-
formed.

It is very hard to tailor measurement to
heuristic construction processes. To ad-
dress this problem, we are developing a
language that lets us model any develop-
ment process explicitly and instrument
that process for measurement. The ex-
plicit specification of some construction
process may help clarify what the limita-
tions for measuring it are and whether the
need for additional measurements is ur-
gent enough to consider changing the
construction process to make it more
measurable.



I distinguish between two design steps:
architectural, or high-level, design and al-
gorithmic, or low-level, design. Architec-
tural design involves identifying software
components and their interconnection;
algorithmic design involves identifying
data structures and the control flow
within the architectural components.

Most design measurement reported in
the literature measures product complex-
ity at the end of the algorithmic design
phase from program-design-language
documents. Many of these measures
(such as Tom McCabe's cyclomatic-com-
plexity measure) capture product aspects
equally well from program-design-lan-
guage documents and source code, so it is
not surprising that the results derived
through these design measures do not dif-
fer from results derived through corre-
sponding source-code measures.

In this article, I use the term "design
measure" to refer to architectural design
measures. In this context, the measure-
ment of designs is more complicated be-
cause typically less information is docu-
mented in a formal, measurable way at
this early stage.

When you try to measure software de-
signs, you realize that the potential for
measurement is limited by the measur-
ability of the design documents. There is
very often a discrepancy between the
need for measuring a design aspect (such
as number of separate design decisions)
and its measurability or lack thereof
(many design decisions are documented
very informally or not at all).

Therefore, design measurement, more
so than code measurement, can not only
capture design aspects quantitatively, but
it can also drive the development and use 
of more formal, better measurable design 
approaches. The same argument can be 
made in the case of design processes, 
which are typically heuristic rather than 
formally specified. 

Design characterization. We need a way
to characterize software designs based on
architectural design measures. In the Dis-
tos/Incas experiment, we developed de-
sign measures accidentally when we tried
to compare the structural complexity of
products implemented in languages
based on different structural concepts. To
do so, we had to resort to comparisons at
some abstract level.

We defined an abstract model that was
general enough to be instantiated into the
precise models underlying each lan-
guage. In that regard, the abstract struc-
tural model represented the greatest com-
mon denominator between the different
language-specific structural concepts (I
have described the abstract model's in-
stantiations elsewhere).

We then realized that the abstract
model could be instantiated to mea-
sure intercomponent complexity during
design: completely for collection at the
end of algorithmic design, partially at the
end of architectural design.

From this experience, we have learned: 

• Specific measures derived from the
same abstract model can easily be com-
pared across life-cycle phases. Abstract
models and measures let you instantiate
compatible measures to trace some de-
sign aspect across several life-cycle phases.
Compatible measures help identify the
life-cycle phases in which the aspect of in-
terest (in our case, structural complexity)
is predominantly addressed.

In the Distos/Incas experiment, we
measured and traced structural complex- 
ity through several consecutive life-cycle 
phases — from architectural design 
through coding. It became obvious that 
most of the important structural decisions 
had been made irreversibly by the end of 
architectural design. 

• It is difficult to isolate and understand
the effects of design methods. This is due
in part to the creative nature of the design
process itself and in part to the heuristic
and therefore unpredictable (as to their
effect) nature of most design methods.

In the Distos/Incas experiment, we
were tempted to attribute the observed su-
periority of systems implemented in Lady
to the language’s advanced structural fea-
tures. This seemed to be a valid conclu-
sion because we had kept all the other po-
tentially contributing factors as constant
as possible (we had trained students simi-
larly, used the same design-tool support,
and so on).

However, follow-up interviews led us to
believe that the major contributor was the
object-oriented design approach that we
had tailored to support Lady’s structural
concepts. This means that, in this study,
the synergy of language concepts and de-
sign support contributed the real benefits.
However, we were convinced that appro-
priate design support in isolation prom-
ises more payoff than language support in
isolation. Our conclusion agrees with
other experience (in the Ada community,
for example) that the best language con-
cepts are useless without guidelines and
support for their effective use.

Later studies at the SEL evaluated the 
potential effect of different design ap- 
proaches on the resulting design docu- 
ments. These results made us question 
our previous conclusion because they re-
vealed that the designer’s experience and
background is much more important
than the design approach used.

• Architectural design information has 
more influence on maintainability than 
algorithmic design information. Several
publications have described the relative
importance of different algorithmically
oriented design-complexity measures.
Our experience suggests that it may not
be worth distinguishing among them be-
cause they all seem to be relatively unim-
portant compared to intercomponent
complexity.

In the Distos/Incas experiment, we
compared some algorithmic design
measures (such as lines of code and McCabe’s
cyclomatic-complexity measure), some
architectural design measures (such as
Henry and Kafura’s information-flow
measure), and some hybrid design
measures (combinations of architectural and
algorithmic measures) regarding their
ability to predict maintenance behavior.

As Figure 2 shows, in isolation, the
algorithmic measures showed no significant
correlation with maintainability. However,
the architectural measures did
(correlations in the range of 0.7 to 0.8, with a
significance level of less than 0.01). The
hybrid measures had only a slightly higher
correlation with maintainability than the
architectural measures, but there was no
difference among them based on the
algorithmic measure used. Overall, the
correlations of hybrid design measures with
maintainability were only about 0.1 lower
than the correlations of the same
measures computed from source code.

• The dependency between construction
and measurement is even more obvious
during design than it was during coding.
If we believe it is important to
measure certain architectural design
product or process aspects, we must
ensure their measurability.

Design product documentation methods
vary in formality, ranging from
informal English to (semi)formal graph
notations. Most design product measures are
tailored to capture the aspects formally
specified according to a specific method.
Thus, they cannot be applied across
environments that use different design
methods.

The creative nature of the design
process means that many aspects cannot be
formalized, and consequently cannot be
measured at all. While formalization (and
consequent automation) is a solution for more
mechanical processes (such as compilation),
it is not feasible for design
processes. The only feasible way to make
complex creative processes more manageable
and measurable is to divide them into
smaller processes with well-defined
interfaces that can be checked: the
divide-and-conquer principle.

In the Distos/Incas project, we applied
the divide-and-conquer principle in the
form of a stepwise, refinement-oriented
design process. (The Cleanroom method
uses a similar but much more formal
approach.) In our approach, formal
specifications were iteratively refined into lower
level specifications. After each refinement
step, the result was proven correct with
respect to the input specification. This
approach let us control the design process
and lent itself to measurement (such as
the number of design decisions and how
much complexity each design step adds to
the design document).
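
As a hedged illustration of the measures compared in this section, the sketch below computes an information-flow value in the style of Henry and Kafura, commonly stated as length times the square of (fan-in times fan-out), together with a purely architectural variant that drops the length factor. The `Component` class, its field names, and the sample numbers are assumptions made for this example; they are not data from the Distos/Incas experiment.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    length: int   # algorithmic size, e.g. lines of code (illustrative)
    fan_in: int   # information flows entering the component
    fan_out: int  # information flows leaving the component


def architectural_flow(c: Component) -> int:
    """Architectural part only: (fan-in * fan-out) squared."""
    return (c.fan_in * c.fan_out) ** 2


def hybrid_flow(c: Component) -> int:
    """Hybrid measure: algorithmic length weighted by the architectural part."""
    return c.length * architectural_flow(c)


# Hypothetical component, for illustration only.
queue = Component(name="queue", length=120, fan_in=3, fan_out=2)
print(architectural_flow(queue), hybrid_flow(queue))
```

Because the architectural factor is squared, differences in fan-in and fan-out tend to dominate the hybrid value, which fits the observation that the choice of algorithmic measure made little difference.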

Design predictability. We must develop
ways to predict maintainability with
architectural design measures. This means that
we must understand the relationship
between a component’s design characteristics
and its maintenance behavior.

In the Distos/Incas project, we used our
stepwise design approach and measured
the architecture of Lady components. We
then measured their maintenance behavior
with some effort- and error-based
measures.

Figure 2. The capabilities of complexity measures to predict maintainability during design and coding phases.

From this, we learned that

• You can use design measures to predict
maintainability. Generally, the design
phase is considered to be where the
signature of a system is created. If we can
measure during that phase, we should be able
to use this information to predict many
process and product aspects as the life
cycle progresses.

In the Distos/Incas experiment, we
found that design measures could predict
maintenance, locality, isolation effort,
modification, and understandability
almost as well as the corresponding code
measures. Some of the measures were
applicable as early as the end of architectural
design.

• We should expand the definition of
design measures. The Distos/Incas
experiment supports the belief that true
leverage is possible from measuring and
understanding the architectural aspects of
the design product and process, as
opposed to the algorithmic aspects
measured by traditional design measures. For
example, design-process measures could
capture design effort, errors committed
and corrected during design, the
effectiveness of design methods in supporting
fundamental design principles, and the
human aspect in assessing design
alternatives and resolving conflicts.

March 1990

In the Distos/Incas experiment, we
concentrated on measuring the structural
aspects of design products. However, we also
evaluated the stability of designs created
according to design methods that
supported different structural language
concepts. It was very clear from comparisons
of the evolving design versions and from
the designers’ comments that the design
method tailored to the Lady language
(which identifies three structural levels)
resulted in fewer redesigns than the
method tailored to traditional languages
(which typically identify only two
structural levels).

In the SEL, we use a wide spectrum of
design measures, ranging from subjective
measures that capture the human
experience with design methods, to measures
that capture the effectiveness of design
methods in preventing certain errors, to
effort and error measures.

• It is important to document all design
decisions. It has long been recognized
that missing design information makes it
extremely difficult to maintain software
efficiently. While the final design is
important, the design rationale is at least as
important if you are to understand design
decisions and avoid recreating previously
rejected design alternatives.

In the Distos/Incas experiment, we
used more explicit design documents
than are used in most production
environments. However, the information-flow
measures derived from the final design
document had only average predictive
capabilities. Further analysis revealed that
whenever a component had implicit
dependencies with other components, its
maintenance behavior was poorly
predicted. Implicit dependencies between
components included the use of the same
constant, the use of the same algorithm,
and architectural dependencies.
These design decisions were not
reflected explicitly in the final design
document. Fortunately, we had stored all the
versions created during development, so
we could do a postmortem analysis to
identify many implicit dependencies. This
caused us to extend Henry and Kafura’s
information-flow model with implicit,
global information flows. The new design
measure, which combines explicit and
implicit information flows, was significantly
more reliable in predicting maintenance
behavior.
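
One kind of implicit dependency named above, the use of the same constant by several components, can be detected mechanically once each component's constants are known. The following is a minimal sketch under that assumption; the function name, the input format, and the component and constant names are hypothetical and do not reproduce the postmortem analysis performed in the experiment.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def implicit_dependencies(
    uses: Dict[str, Set[str]]
) -> Dict[Tuple[str, str], Set[str]]:
    """Return every pair of components that share at least one constant.

    `uses` maps a component name to the set of constants it references.
    """
    shared: Dict[Tuple[str, str], Set[str]] = {}
    for a, b in combinations(sorted(uses), 2):
        common = uses[a] & uses[b]
        if common:
            shared[(a, b)] = common
    return shared


# Hypothetical components and constants, for illustration only.
example = {
    "scheduler": {"MAX_NODES", "TIMEOUT"},
    "router": {"MAX_NODES"},
    "logger": {"LOG_LEVEL"},
}
print(implicit_dependencies(example))
```

Pairs found this way could be counted as additional information flows when computing a design measure; how to weight them against explicit flows is left open here.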


Research framework

Measurement is useful to understand,
control, and improve products and
processes based on objective data rather than
subjective judgment. It also helps you
build better models of processes and
products. However, successful
measurement requires more than a set of
measures, just as successful design requires
more than a set of design tools.

I suggest the following comprehensive
design-measurement framework, which
includes measurement approaches,
mechanisms to model design aspects, the
entire range of candidate design
measures, and guidelines for reporting
design-measurement results:

• Choose and tailor an effective
measurement approach. I suggest the GQM
and quality-improvement paradigms for
both individual experiments and case
studies as well as continuous
organizational improvement. Both paradigms
theoretically can exist without measurement,
but you must measure if you want to
evaluate and improve based on objective data
rather than just subjective judgment.

Both paradigms incorporate
measurement in a goal-oriented fashion: Measures
serve goals! Both must be instantiated
into an operational approach tailored to
the specific environment characteristics.
In the TAME project, we developed
templates and guidelines to help formally
support the setting of goals and the derivation
of measures.

• Model the design aspects of interest.
To use the paradigms properly, you must
model the product and process aspects of
interest. The product aspects of interest
are those addressed and documented
during the design phase (such as data and
intercomponent structure, and control
and information flow). The process
aspects of interest are harder to model. In a
separate project at the University of
Maryland, we are developing a
process-modeling language that acknowledges the need
to specify mechanical and creative design
aspects by combining imperative and
constraint-oriented language principles.

• Consider a variety of design measures.
Candidate design measures address the
design process and product, characterize
design aspects directly or serve as
indirect measures to help predict
other qualities of interest (such as
maintainability), and represent design
information objectively, subjectively, and on
different scales.

Design-process measures may capture
effort distributions, defect profiles, or
patterns of design-conflict resolutions.
Design-product measures include measures
of length, structural complexity,
data-structure and dataflow complexity, and
information-structure and
information-flow complexity.

A direct design measure characterizes a
design aspect. In comparing Lady systems
to systems implemented in a traditional
language, the measure “structural
complexity in terms of incoming and outgoing
information flows” was used as a direct
measure of design-product complexity
and the measure “effort in staff-hours
spent on designing” as a direct measure of
design cost.

An indirect design measure helps
predict the expected value of a direct
measure. To measure maintainability, a
meaningful direct measure might be “effort
per maintenance change.” The indirect
design measure “structural complexity”
was identified in the Distos/Incas
experiment as a useful indirect measure
for predicting maintainability.

Knowing the relationship between
indirect and direct measures for a particular
characteristic lets you predict whether
requirements for this characteristic can be
fulfilled and, in turn, correct
developments where necessary.

Objective design measures are
preferred over subjective design measures.
Examples of typical objective measures
are “effort in staff hours spent on design”
and “number of design components.”
Examples of typical subjective measures are
“degree to which a design method was
used” and “experience of staff with the
design method.” It is important to
understand the scale of a given design measure
and the corresponding implications on its
interpretability.

• Define guidelines for reporting
measurement results. The GQM and
quality-improvement paradigms provide not only
a good context for measurement but
sound guidelines for reporting
measurement results as well. You can use the steps
of the quality-improvement paradigm as a
structure to report results:

1. Characterize the environment to the
degree necessary to understand the
measurement goals, the experimental design,
and the data interpretations.

2. Describe the measurement goals.

3. Describe the measures chosen.

4. Describe the experimental design,
including procedures for data collection,
validation, and analysis, as well as
hypotheses.

5. Characterize the collected data.

6. Present the analysis results and
validate the hypotheses.

7. Summarize the contribution of the
results to the original goals and outline
possible lessons for future measurement
tasks.
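
The relationship between an indirect and a direct measure discussed in this framework can be made concrete with a simple least-squares line. The sketch below is only an illustration: the linear model is an assumption, the function names are hypothetical, and the sample values are invented for the example rather than taken from any SEL or Distos/Incas data.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope: float, intercept: float, x: float) -> float:
    """Predict the direct measure from a value of the indirect measure."""
    return slope * x + intercept


# Invented illustrative values: structural complexity (indirect measure)
# versus effort per maintenance change in staff-hours (direct measure).
complexity = [1.0, 2.0, 3.0, 4.0]
effort = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(complexity, effort)
print(predict(slope, intercept, 5.0))
```

In practice the fit would have to be validated against data from the environment in question before being used to check whether a maintainability requirement is likely to be met.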

Effective design measurement
promises to contribute to quality
and productivity. Design measurement
has many dimensions and should be
closely tied to the design methodology
used. There are components of
design-measurement technology available today,
including general measurement
approaches; the first TAME prototype is
composed largely of available
measurement technology.

Design-measurement areas that require
further research include the
development of tractable (or measurable) design
methods, the further formalization of
measurement approaches, the identification
of important design principles that
need to be better understood through
design measures, the integration of
construction and measurement, and the
quantification of intellectual design activities
such as exploring and rejecting design
alternatives.
Institute for Advanced Computer Studies and
Department of Computer Science 
University of Maryland 
College Park, MD 20742 


ABSTRACT 

Reuse of products, processes and related knowledge will be the key to enable the
software industry to achieve the dramatic improvement in productivity and quality required to
satisfy the anticipated growing demands. We need a comprehensive framework of models and
model-based characterization schemes for better understanding, evaluating, and planning all
aspects of reuse. In this paper we define requirements for comprehensive reuse models and
related characterization schemes, assess state-of-the-art reuse characterization schemes relative to
these requirements and motivate the need for more comprehensive reuse characterization
schemes. We introduce a characterization scheme based upon a general reuse model, apply it
and discuss its benefits, and suggest a model for integrating reuse into software development.


*Research for this study was supported in part by NASA grant NSG-5123, ONR grant N00014-87-K.0307 and Airmics grant
DE-nmi-840R21400 to the University of Maryland.
1. INTRODUCTION


The existing gap between demand and our ability to produce high quality software cost- 
effectively calls for an improved software development technology. A reuse oriented development 
technology can significantly contribute to higher quality and productivity. Quality should 
improve by reusing proven experience in the form of products, processes and related knowledge 
such as plans, measurement data and lessons learned. Productivity should increase by using 
existing experience rather than creating everything from scratch. Many different approaches to 
reuse have appeared in the literature (e.g., [7, 9, 11, 13, 14, 15, 16, 21, 22, 23]). 

Reusing existing experience is a key ingredient to progress in any area. Without reuse 
everything must be re-learned and re-created; progress in an economical fashion is unlikely. 
The goal of research in the area of reuse is the achievement of systematic approaches for effec- 
tively reusing existing experience to maximize quality and cost benefits. 

This paper defines and demonstrates the usefulness of model-based reuse characterization 
schemes. From a number of important assumptions regarding the nature of software development 
and reuse we derive four essential requirements for any useful reuse models and related character- 
ization schemes (Section 2). Existing models and characterization schemes are assessed with 
respect to these assumptions and the need for more comprehensive models and characterization 
schemes is established (Section 3). We introduce a reuse characterization scheme based on a
general model of reuse (Section 4), and discuss its practical application and benefits (Section 5).
Throughout the paper we use examples of reusing generic Ada packages, design inspections, and 
cost models to demonstrate our approach. Finally, we present a model for integrating and sup- 
porting reuse in software development (Section 6). 
2. BASIC REQUIREMENTS FOR A REUSE CHARACTERIZATION SCHEME

The reuse approach presented in this paper is based on a number of assumptions regarding 
software development in general and reuse in particular. These assumptions are based on more 
than ten years of analyzing software processes and products [1, 3, 4, 5, 6, 10]. This section states
our assumptions regarding development and reuse (Sections 2.1 and 2.2, respectively), and derives 
a set of characteristics required for any useful reuse characterization scheme (Section 2.3). 


2.1. Software Development Assumptions

According to a common software development project model depicted in Figure 1, the goal 
of software development is to produce project deliverables (i.e., project output) that satisfy
project needs (i.e., project input) [25]. This goal is achieved according to some development process
model which coordinates personnel, practices, methods and tools. 



Figure 1: Software Development Project Model 
With regard to software development we make the following assumptions:


(D1) A single software development process model cannot be assumed for all software
development projects: Different project needs and other project characteristics may suggest 
and justify different development process models. The potential differences may range from 
different development process models themselves to different practices, methods and tools sup- 
porting these development process models to different personnel. 

(D2) Practices, methods and tools - including reuse-related ones - need to be tailored 
to the project needs and characteristics: Under the assumption that practices, methods 
and tools support a particular development project, they need to be tailored to the needs and 
objectives, development process model, and other characteristics of that project. 


2.2. Software Reuse Assumptions 

Reuse-oriented software development (depicted in Figure 2) assumes that, given the
project-specific need to develop an object 'x' that meets specification 'x̄', we take advantage of
some already existing object 'x_k' ∈ {'x_1', ..., 'x_n'} instead of developing 'x' from scratch. In this
case, 'x̄' is not only the specification for 'x' but also the reuse specification for the set of reuse
candidates 'x_1', ..., 'x_n'. Reuse includes the identification of a set of reuse candidates {'x_1',
..., 'x_n'}, the evaluation of their potential to satisfy reuse specification 'x̄' effectively and the
selection of the best-suited candidate 'x_k', the possible modification of the chosen candidate 'x_k'
into 'x', and the integration of 'x' into the development process of the current project.
Figure 2: Reuse-Oriented Software Development Model


With regard to software reuse we make the following assumptions:

(R1) All experience can be reused: Typically, the emphasis is on reusing objects of type
'source code'. This limitation reflects the traditional view that software equals code. It ignores
the importance of reusing software products across the entire life-cycle (which includes the
planning as well as the production phases of a software development project), software
processes and methods, and other kinds of knowledge such as models, measurement data or
lessons learned.


The reuse of 'generic Ada packages' represents an example of product reuse. Generic Ada
packages represent templates for instantiating specific package objects according to a parameter
mechanism. The reuse of 'design inspections' represents an example of process reuse. Design
inspections are off-line fault detection and isolation methods applied during the module design
phase. They can be based on different techniques for reading (e.g., ad hoc, sequential, control
flow oriented, stepwise abstraction oriented). The reuse of 'cost models' represents an example
of knowledge reuse. Cost models are used in the estimation, evaluation and control of project
cost. They predict cost (e.g., in the form of staff-months) based on a number of characteristic
project parameters (e.g., estimated product size in KLoC, product complexity, methodology
level).

(R2) Reuse typically requires some modification of the object being reused: Under the
assumption that software developments may be different in some way, modification of
experience from prior projects must be anticipated. The degree of modification depends on how
many, and to what degree, existing object characteristics differ from their desired
characteristics.

To reuse an Ada package 'list of integers' to organize a 'list of reals' we need to modify it. We
can either modify the existing package by hand, or we can use a generic package 'list' which can
be instantiated via a parameter mechanism for any base type.
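
The second option, a generic 'list' instantiated for any base type, can be sketched outside Ada as well. The following minimal Python analogue uses type parameters in place of Ada's generic parameter mechanism; the class name `GenericList` and its methods are assumptions made for this illustration and do not correspond to any package discussed in the paper.

```python
from typing import Generic, List, TypeVar

T = TypeVar("T")  # stands in for the base type of the list


class GenericList(Generic[T]):
    """A list template that is written once and reused for any base type."""

    def __init__(self) -> None:
        self._items: List[T] = []

    def append(self, item: T) -> None:
        self._items.append(item)

    def items(self) -> List[T]:
        # Return a copy so callers cannot modify the internal state.
        return list(self._items)


# "Instantiating" the same template for two base types; no code is modified.
integers: GenericList[int] = GenericList()
integers.append(3)

reals: GenericList[float] = GenericList()
reals.append(2.5)

print(integers.items(), reals.items())
```

The point of the sketch is that reuse through parameterization requires no change to the reused object itself, in contrast to modification by hand.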

To reuse a design inspection method across projects characterized by significantly different fault
profiles, the underlying reading technique may need to be tailored to the respective fault profiles.
If 'interface faults' replace 'control flow faults' as the most common fault type, we can either
select a different reading technique altogether (e.g., step-wise abstraction instead of
control-flow oriented) or we can establish specific guidelines for identifying interface faults.

To reuse a cost model across projects characterized by different application domains, we may
have to change the number and type of characteristic project parameters used for estimating
cost as well as their impact on cost. If 'commercial software' is developed instead of 'real-time
software', we may have to consider re-defining 'estimated product size' to be measured in terms
of 'data structures' instead of 'lines of code' or re-computing the impact of the existing
parameters on cost. Using a cost model effectively implies a constant updating of our understanding of
the relationship between project parameters and cost.


(R3) Analysis is necessary to determine when and if reuse is appropriate: The decision
to reuse existing experience as well as how and when to reuse it needs to be based on an
analysis of the payoff. Reuse payoff is not always easy to evaluate. We need to understand (i)
the objectives of reuse, (ii) how well the available reuse candidates are qualified to meet these
objectives, and (iii) the mechanisms available to perform the necessary modification.


Assume the existence of a set of Ada generics which represent application-specific components 
of a satellite control system. The objective may be to reuse such components to build a new 
satellite control system of a similar type, but with higher precision. Whether the existing gener- 
ics are suitable depends on a variety of characteristics: Their correctness and reliability, their 
performance in prior instances of reuse, their ease of integration into a new system, the
potential for achieving the higher degree of precision through instantiation, the degree of change
needed, and the existence of reuse mechanisms that support this change process. Candidate 
Ada generics may theoretically be well suited for reuse; however, without knowing the answers 
to these questions, they may not be reused due to lack of confidence that reuse will pay off. 

Assume the existence of a design inspection method based on ad-hoc reading which has been
used successfully on past satellite control software developments within a standard waterfall
model. The objective may be to reuse the method in the context of the Cleanroom development
method [18, 20]. In this case, the method needs to be applied in the context of a different
life-cycle model, different design approach, and different design representations. Whether and how
the existing method can be reused depends on our ability to tailor the reading technique to the
stepwise refinement oriented design technique used in Cleanroom, and the required intensity of
reading due to the omission of developer testing. This results in the definition of the stepwise
abstraction oriented reading technique [Sj.

Assume the existence of a cost model that has been validated for the development of satellite
control software based on a waterfall life-cycle model, functional decomposition oriented design 
techniques, and functional and structural testing. The objective may be to reuse the model in 
the context of Cleanroom development. Whether the cost model can be reused at all, how it 
needs to be calibrated, or whether a completely different model may be more appropriate 
depends on whether the model contains the appropriate variables needed for the prediction of 
cost change or whether they simply need to be re-calibrated. This question can only be answered 
through thorough analysis of a number of Cleanroom projects.

(R4) Reuse must be integrated into the specific software development: Reuse is intended 
to make software development more effective. In order to achieve this objective we need to 
tailor reuse practices, methods and tools towards the respective development process. 


We have to decide when and how to identify, modify and integrate existing Ada packages. If we 
assume identification of Ada generics by name, and modification by the generic parameter 
mechanism, we require a repository consisting of Ada generics together with a description of the 
instantiation parameters. If we assume identification by specification, and modification of the 
generic’s code by hand, we require a suitable specification of each generic, a definition of 
semantic closeness of specifications so we can find suitable reuse candidates, and the appropri- 
ate source code documentation to allow for ease of modification. In the case of identification 
by specification we may consider identifying reuse candidates at high-level design (i.e., when the 
component specifications for the new product exist) or even when defining the requirements. 

We have to decide on how often, when, and how design inspections should be integrated into the 
development process. If we assume a waterfall-based development life-cycle, we need to deter- 
mine how many design inspections need to be performed and when (e.g., once for all modules at 
the end of module design, once for all modules of a subsystem, or once for each module). We 
need to state which documents are required as input to the design inspection, what results are 
to be produced, what actions are to be taken, and when, in case the results are insufficient, and
who is supposed to participate. 

We have to decide when to initially estimate cost and when to update the initial estimate. If we 
assume a waterfall-based development life-cycle, we may estimate cost initially based on 
estimated product and process parameters (e.g., estimated product size). After each milestone, 
the estimated cost can be compared with the actual cost. Possible deviations are used to correct 
the estimate for the remainder of the project. 
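
The milestone-based correction described in the cost model example can be sketched as follows. The correction rule used here, scaling the remaining planned cost by the ratio of actual to planned cost so far, is an assumption chosen for illustration; the paper does not prescribe a particular rule, and the function name and figures are hypothetical.

```python
from typing import Sequence


def updated_estimate(planned: Sequence[float], actual: Sequence[float]) -> float:
    """Re-estimate total cost after the milestones completed so far.

    `planned` holds the initial cost estimate per phase (e.g. staff-months);
    `actual` holds the observed cost of the phases already completed.
    """
    done = len(actual)
    if done == 0:
        return float(sum(planned))       # nothing observed yet
    spent = sum(actual)
    ratio = spent / sum(planned[:done])  # observed deviation so far
    remaining = sum(planned[done:])
    return spent + ratio * remaining     # corrected total


# Hypothetical figures: three phases planned, the first one completed.
print(updated_estimate([10.0, 20.0, 30.0], [12.0]))
```

Calling the function again after each milestone, with one more entry in `actual`, mirrors the repeated comparison of estimated and actual cost described above.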


2.3. Software Reuse Characteristics 

The above software reuse assumptions suggest that 'reuse' is a complex concept. We need to
build models and characterization schemes that allow us to define and understand, compare and 
evaluate, and plan the objectives of reuse, the candidate objects of reuse, the reuse process itself, 
and the potential for effective reuse. Based upon the above assumptions, such models and
characterization schemes need to exhibit the following characteristics:

(C1) Applicable to all types of reuse objects: We want to be able to characterize products,
processes and all other types of related knowledge such as plans, measurement data or lessons
learned.

(C2) Capable of characterizing objects-before-reuse and objects-after-reuse: We want
to be able to characterize the reuse candidates (from here on called 'objects-before-reuse') as
well as the object actually being reused in the current project (from here on called
'object-after-reuse'). This will enable us to (i) judge the suitability of a given reuse candidate based on
the distance between its actual before-reuse and desired after-reuse characteristics, and (ii)
establish criteria for useful reuse candidates (object-before-reuse characteristics) based on
anticipated objectives for their (re)use (object-after-reuse characteristics).

(C3) Capable of characterizing the reuse process itself: We want to be able to (i) judge
the ease of bridging the gap between different object characteristics before- and after-reuse,
and (ii) derive additional criteria for useful reuse candidates based on characteristics of the
reuse process itself.

(C4) Capable of being systematically tailored to specific project (i.e., development
and reuse) needs and other characteristics: We want to be able to adjust a given reuse
characterization scheme to changing needs in a systematic way. This requires not only the
ability to change the scheme, but also some kind of rationale that ties the given reuse
characterization scheme back to its underlying model and assumptions. Such a rationale enables us to
identify the impact of different environments and modify the scheme in a systematic way.
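
Characteristic (C2) speaks of the distance between a candidate's before-reuse characteristics and the desired after-reuse characteristics. One very simple way to make that idea concrete, offered here only as a sketch, is to list the dimensions on which the two characterizations differ. The dictionary representation, the dimension names, and the sample values are assumptions made for this illustration, and counting differing dimensions is just one possible notion of distance.

```python
from typing import Dict, List


def differing_dimensions(before: Dict[str, str], desired: Dict[str, str]) -> List[str]:
    """Return the dimensions on which a candidate differs from what is desired."""
    return sorted(dim for dim in desired if before.get(dim) != desired[dim])


# Hypothetical characterizations, for illustration only.
candidate = {
    "function": "integer_queue",
    "representation": "Ada",
    "granularity": "package",
}
wanted = {
    "function": "real_queue",
    "representation": "Ada",
    "granularity": "package",
}

gap = differing_dimensions(candidate, wanted)
print(gap, len(gap))
```

A smaller gap suggests less modification effort, although judging that effort properly also requires characterizing the reuse process itself, as (C3) points out.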
3. STATE-OF-THE-ART REUSE CHARACTERIZATION SCHEMES


A number of research groups have developed characterization schemes for reuse (e.g., [9, 11, 
13, 21, 22]). The schemes can be distinguished as special purpose schemes and meta schemes.

The large majority of published characterization schemes have been developed for a special 
purpose. They consist of a fixed number of characterization dimensions. Their intention is to
characterize software products as they exist. Typical dimensions for characterizing source code
objects in a repository are "function", "size", or "type of problem". Example schemes include
the schemes published in [11, 13], the ACM Computing Reviews Scheme, AFIPS’s Taxonomy of 
Computer Science and Engineering, schemes for functional collections (e.g., GAMS, SHARE, SSP, 
SPSS, IMSL) and schemes for commercial software catalogs (e.g., ICP, IDS, IBM Software Cata- 
log, Apple Book). It is obvious that special purpose schemes are not designed to satisfy the reuse 
modeling characteristics of section 2.3. 

A few characterization schemes can be instantiated for different purposes. They explicitly 
acknowledge the need for different schemes (or the expansion of existing ones) due to different or 
changing needs of an organization. They, therefore, allow the instantiation of any imaginable 
scheme. An excellent example is Ruben Prieto-Diaz's facet-based meta-characterization scheme
[14, 17]. Theoretically, meta schemes are flexible enough to allow the capturing of any reuse
aspect. However, based on known examples of actual uses of meta schemes, such broadness seems 
not intended. Instead, most examples focus on product reuse, are limited to the objects-before- 
reuse, and ignore the reuse process entirely. Meta schemes were also not designed to satisfy the 
reuse modeling characteristics of section 2.3. 

We have found that existing schemes - special purpose as well as meta schemes - do not 
satisfy our requirements. To illustrate the problems associated with their limitations, we use the 
following example scheme which can be viewed either as a special-purpose scheme or a specific
instantiation of a meta scheme:


Eaeli reuse candidate is characterized in terms of 

• name: What is the object's name? (e.g,, buffer.ada, seMnspection, sel_cost_model) 

• function: What is the functional specification or purpose of the object? (e.g., integer_queue, 
<element> buffer, sensor control system, certify appropriateness of design documents, 
predict project cost) 

• use: How can the object be used? (e.g., product, process, knowledge) 

• type: What type of object is it? (e.g., requirements document, code document, inspection 
method, coding method, specification tool, graphic tool, process model, cost model) 

• granularity: What is the object's scope? (e.g., system level, subsystem level, component 
level, module - package, procedure, function -- level, entire life cycle, design stage, coding 
stage) 

• representation: How is the object represented? (e.g., data, informal set of guidelines, 
schematized templates, formal mathematical model, languages such as Ada, automated tools) 

• input/output: What are the external input/output dependencies of the object needed to 
completely define/extract it as a self-contained entity? (e.g., global data referenced by a 
code unit, formal and actual input/output parameters of a procedure, instantiation parame- 
ters of a generic Ada package, specification and design documents needed to perform a design 
inspection, defect data produced by a design inspection, variables of a cost model) 

• dependencies: What are additional assumptions and dependencies needed to understand the 
object? (e.g., assumption on user's qualification such as knowledge of Ada or qualification to 
read, specification document to understand a code unit, readability of design document, 
homogeneity of problem classes and environments underlying a cost model) 

• application domain: What application classes was the object developed for? (e.g., ground 
support software for satellites, business software for banking, payroll software) 

• solution domain: What environment classes was the object developed in? (e.g., waterfall 
life-cycle model, spiral life-cycle model, iterative enhancement life-cycle model, functional 
decomposition design method, standard set of methods) 

• object quality: What qualities does the object exhibit? (e.g., level of reliability, correctness, 
user-friendliness, defect detection rate, predictability) 
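
The eleven dimensions listed above can be written down as a simple record type. The following is only a minimal sketch in Python, not part of the original scheme: the class name `ObjectCharacterization`, the field names, and the choice of plain strings for categories are assumptions made for illustration. The example values are taken from the 'buffer.ada' categories mentioned in the list.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ObjectCharacterization:
    """One reuse candidate described along the eleven example dimensions.

    Hypothetical record type; categories are kept as free-form strings.
    """
    name: str
    function: str
    use: str
    object_type: str          # the "type" dimension
    granularity: str
    representation: str
    input_output: str
    dependencies: str
    application_domain: str
    solution_domain: str
    object_quality: str


# Example instance built from categories named in the dimension list.
buffer_ada = ObjectCharacterization(
    name="buffer.ada",
    function="<element>_buffer",
    use="product",
    object_type="code document",
    granularity="package",
    representation="Ada",
    input_output="instantiation parameters of a generic Ada package",
    dependencies="assumes knowledge of Ada",
    application_domain="ground support software for satellites",
    solution_domain="waterfall life-cycle model",
    object_quality="high reliability",
)

# The dimension names can be recovered from the record definition.
dimension_names = [f.name for f in fields(ObjectCharacterization)]
```

A fixed record like this mirrors a special purpose scheme: the set of dimensions cannot change without editing the type itself, which is the limitation discussed under (C4) below.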


Let's assess the above reuse characterization scheme relative to the four desired characteristics of 
section 2.3: 

(C1) It is theoretically possible to characterize all types of experience according to the above 
scheme (in the case of a meta scheme we could even create new ones). For example, a generic Ada 
package 'buffer.ada' may be characterized as having identifier 'buffer.ada', offering the function 
'<element>_buffer', being usable as a 'product' of type 'code document' at the 'package 
module level', and being represented in 'Ada'. The self-contained definition of the package 
requires knowledge regarding the instantiation parameters as well as its visibility of externally 
defined objects (e.g., explicit access through WITH clauses, implicit access according to nesting 
structure). In addition, effective use of the object may require some basic knowledge of the 
language Ada and assume thorough documentation of the object itself. It may have been 
developed within the application domain 'ground support software', according to a 'waterfall 
life-cycle' and 'functional decomposition design', and exhibiting high quality in terms of 
'reliability'. 

Footnote (to the dimension list above): Characterization dimensions are marked with "•"; example 
categories for each dimension are listed in parentheses. 

(C2) The scheme is used to characterize reuse candidates (i.e., objects-before-reuse) only. How- 
ever, in order to evaluate the reuse potential of an object-before-reuse in a given reuse 
scenario, one needs to understand the distance between its characteristics and the 
characteristics of the desired object (i.e., object-after-reuse). In the case of the Ada package example, the 
required function may be different, the quality requirements with respect to reliability may be 
higher, or the design method used in the current project may be different from the one accord- 
ing to which the package has been created originally. Without understanding the distance to 
be bridged between reuse requirements and reuse candidates it is hard to (a) predict the cost 
involved in reusing a particular object, and (b) establish criteria for populating a reuse reposi- 
tory that supports cost-effective reuse. 

(C3) The scheme is not intended to characterize the reuse process at all. To really predict the 
cost of reuse we do not only have to understand the distance to be bridged between 
objects-before-reuse and objects-after-reuse (as pointed out above), but also the intended process to bridge it 
(i.e., the reuse process). For example, it can be expected that it is easier to bridge the distance 
with respect to function by using a parameterized instantiation mechanism rather than modify- 
ing the existing package by hand. 

(C4) There is no explicit rationale for the eleven dimensions of the example scheme. That makes 
it hard to reason about its appropriateness as well as to modify it in any systematic way. There 
is no guidance in tailoring the example scheme to new needs, either with respect to what is to be 
changed (e.g., only some categories, dimensions, or the entire implicitly underlying model) or 
how it is to be changed. 


The result of this assessment suggests the urgent need for new, better reuse characterization 
schemes. In the next section, we suggest a model-based scheme which satisfies all four characteris- 
tics. 


4. MODEL-BASED REUSE CHARACTERIZATION SCHEMES 

In this section we define a model-based reuse characterization scheme satisfying the 
characteristics (C1-4) stated in section 2.3. We start this modeling approach with a very general reuse 
model satisfying the reuse assumptions, and refine it step by step until it generates reuse 
characterization dimensions at the level of detail needed to understand, evaluate, motivate or 
improve reuse. This modeling approach allows us to deal with the complexity of the modeling 
task itself, and document an explicit rationale for the resulting model. 


4.1. The Abstract Reuse Model 

The general reuse model used in this section is consistent with the view of reuse represented 
in section 2.2. It assumes the existence of objects-before-reuse and objects-after-reuse, and a 
transformation between the two: 

[Figure: BEFORE REUSE  -->  PROCESS  -->  AFTER REUSE] 

Figure 3: Abstract Reuse Model (Refinement level 0) 


The objects-before-reuse represent experience from prior projects, have been evaluated as being 
of potential reuse value, and have been made available in some form of a repository. The 
objects-after-reuse are the (potentially modified) versions of objects-before-reuse integrated into 
some project other than the one they were initially created for. Object-after-reuse characteristics 
represent the ’reuse specification' for any candidate 'object-before-reuse'. Both the objects- 
before-reuse and the objects-after-reuse may represent any type of experience accumulated in the 
context of software projects ranging from products to processes to knowledge. The reuse process 
transforms objects-before-reuse into objects-after-reuse. 


4.2. The First Model Refinement Level 

Figure 4 depicts the result of the first refinement step of the general model of Figure 3. 

Figure 4: Our Reuse Model (Refinement level 1) 


Each object-before-reuse is a specific candidate for reuse. It has various attributes that 
describe and bound the object. Most objects are physically part of a system, i.e. they interact 
with other objects to create some greater object. If we want to reuse an object we must under- 
stand its interaction with other objects in the system in order to extract it as a unit, i.e. object 
interface. Objects were created in some environment which leaves its characteristics on the 
object, even though those characteristics may not be visible. We call this the object context. 

The object-after-reuse is a specification for a set of before-reuse candidates. Therefore, we 
may have to consider different attributes. The system in which the transformed object is 
integrated and the system context in which the system is developed must also be classified. 

The reuse process is aimed at extracting the object-before-reuse from a repository based on 
the available object-after-reuse characteristics, and making it ready for reuse in the system and 
context in which it will be reused. We must describe the various reuse activities and classify 
them. The reuse activities need to be integrated into the reuse-enabling software development 
process. The means of integration constitute the activity interface. Reuse requires the transfer of 
experience across project boundaries. The organizational support provided for this experience 
transfer is referred to as activity context. 


Based upon the goals for the specific project, as well as the organization, we must evaluate 
(i) the required qualities of the object-after-reuse, (ii) the quality of the reuse process, especially 
its integration into the enabling software evolution process, and (iii) the quality of the existing 
objects-before-reuse. 

4.3. The Second Model Refinement Level 

Each component of the First Model Refinement (Figure 4) is further refined as depicted in 
Figures 5(a-c). It needs to be noted that these refinements are based on our current 
understanding of reuse and may, therefore, change in the future. 

4.3.1. Objects-Before-Reuse 

In order to characterize the object itself, we have chosen to provide the following six 
dimensions and supplementing categories: the object's name (e.g., buffer.ada), its function (e.g., 
integer_buffer), its possible use (e.g., product), its type (e.g., requirements document), its 
granularity (e.g., module), and its representation (e.g., Ada language). The object interface consists of 
the explicit inputs/outputs needed to define and extract the object as a 
self-contained unit (e.g., instantiation parameters in the case of a generic Ada package), and the 
additionally required assumptions and dependencies (e.g., user's knowledge of Ada). Whereas 
the object and object interface dimensions provide us with a snapshot of the object at hand, the 
object context dimension provides us with historical information such as the application classes 
the object was developed for (e.g., ground support software for satellites), the environment the 
object was developed in (e.g., waterfall life-cycle model), and its validated or anticipated quality 
(e.g., reliability). 

The resulting model refinement is depicted in Figure 5a. 

Figure 5a: Reuse Model (Objects-Before-Reuse / Refinement level 2) 


A detailed definition of the above eleven dimensions - together with example categories - 
has already been presented in Section 3. In contrast to Section 3, we now have (i) a rationale for 
these dimensions (see Figure 5a) and (ii) understand that they cover only part (i.e., the objects- 
before-reuse) of the comprehensive reuse model depicted in Figure 4. 


4.3.2. Objects-After-Reuse 

In order to characterize objects-after-reuse, we have chosen the same eleven dimensions and 
supporting categories as for the objects-before-reuse. The resulting model refinement is depicted 
in Figure 5b: 

Object:         name, function, use, type, granularity, representation 
System:         input/output, dependencies 
System Context: application domain, solution domain, object quality 

Figure 5b: Reuse Model (Objects-After-Reuse / Refinement level 2) 


However, an object may change its characteristics during the actual process of reuse. 
Therefore, its characterizations before-reuse and after-reuse can be expected to be different. For 
example, an object-before-reuse may be a compiler (type) product (use), and may have been 
developed according to a waterfall life-cycle approach (solution domain). The object-after-reuse 
may be a compiler (type) process (use) integrated into a project based on iterative enhancement 
(solution domain). 

This means that despite the similarity between the refined models of objects-before-reuse 
and objects-after-reuse, there exists a significant difference in emphasis: in the former case the 
emphasis is on the potentially reusable objects themselves; in the latter case, the emphasis is on 
the system in which these object(s) are (or are expected to be) reused. This explains the use of 
different dimension names: 'system' and 'system context' instead of 'object interface' and 'object 
context'. 

The distance between the characteristics of an object-before-reuse and an object-after-reuse 
gives an indication of the gap to be bridged in the event of reuse. 
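
Because both kinds of objects are characterized along the same dimensions, this distance can be made explicit by comparing the two characterizations dimension by dimension. The sketch below is one possible way to do so, assuming characterizations are held as plain Python dictionaries; the function name `discrepancies` and the abbreviated category values are illustrative assumptions, not part of the model itself.

```python
def discrepancies(before, after):
    """Return {dimension: (before_value, after_value)} for every dimension
    of the object-after-reuse specification that the candidate does not match.

    A dimension missing from the candidate is reported with None.
    """
    return {
        dim: (before.get(dim), wanted)
        for dim, wanted in after.items()
        if before.get(dim) != wanted
    }


# Abbreviated, illustrative characterizations of the Ada package example.
object_before_reuse = {
    "function": "<element>_buffer",
    "use": "product",
    "representation": "Ada",
    "solution domain": "functional decomposition design",
}
object_after_reuse = {
    "function": "string_buffer",
    "use": "product",
    "representation": "Ada",
    "solution domain": "object oriented design",
}

gap = discrepancies(object_before_reuse, object_after_reuse)
```

Here `gap` contains the 'function' and 'solution domain' dimensions only. Such a list says where the gap lies; it does not yet say what bridging it costs, which depends on the reuse process described next.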


4.3.3. Reuse Process 


The reuse process consists of several activities. In the remainder of this paper, we will use a 
model consisting of four basic activities: identification, evaluation, modification, and integration. 
In order to characterize each reuse activity we may be interested in its name (e.g., modify.pl), its 
function (e.g., modify an identified reuse candidate to entirely satisfy given object-after-reuse 
characteristics), its type (e.g., modification), and the mechanism used to perform its function (e.g., 
modification via parameterization). The interface of each activity may consist of such things as 
what the explicit input/output interfaces between the activity and the enabling software evolution 
environment are (e.g., in the case of modification: performed during the coding phase, assumes 
the existence of a specification), and what other assumptions regarding the evolution environment 
need to be satisfied (e.g., existence of certain configuration control policies). The activity context 
may include information about how experience is transferred from the object-before-reuse 
domain to the object-after-reuse domain (experience transfer), and the quality of each reuse 
activity (e.g., reliability, productivity). 

This refinement of the reuse process is depicted in Figure 5c. 



Activity:           name, function, type, mechanism 
Activity Interface: input/output, dependencies 
Activity Context:   experience transfer, reuse quality 

Figure 5c: Reuse Model (Reuse Process / Refinement level 2) 


In more detail, the dimensions and example categories for characterizing the reuse process are: 

• REUSE PROCESS: For each reuse activity characterize: 

+ Activity: 

- name: What is the name of the activity? (e.g., identify.generics, evaluate.generics, 
modify.generics, integrate.generics) 

- function: What is the function performed by the activity? (e.g., select candidate objects 
{x_i} which satisfy certain object categories of the object-after-reuse specification 'T'; 
evaluate the potential of the selected candidate objects of satisfying the given system and 
system context dimensions of the object-after-reuse specification 'T' and pick the most 
suited candidate 'x'; modify 'x' to entirely satisfy 'T'; integrate object 'x' into the 
current development project) 

- type: What is the type of the activity? (e.g., identification, evaluation, modification, 
integration) 

- mechanism: How is the activity performed? (in the case of identification: e.g., by name, 
by function, by type and function; in the case of evaluation: e.g., by subjective judgement, 
by evaluation of historical baseline measurement data; in the case of modification: e.g., 
verbatim, parameterized, template-based, unconstrained; in the case of integration: e.g., 
according to the system configuration plan, according to the project/process plan) 

+ Activity Interface: 

- input/output: What are explicit input and output interfaces between the reuse activity 
and the enabling software evolution environment? (in the case of identification: e.g., 
specification for the needed object-after-reuse / set of candidate objects-before-reuse, in 
the case of modification: e.g., one selected object-before-reuse, specification for the needed 
object-after-reuse / object-after-reuse) 

- dependencies: What are other implicit assumptions and dependencies on data and infor- 
mation regarding the software evolution environment? (e.g., time at which reuse activity 
is performed - relative to the enabling development process: e.g., during design or coding 
stages; additional information needed to perform the reuse activity effectively: e.g., pack- 
age specification to instantiate a generic package, knowledge of system configuration plan, 
configuration management procedures, or project plan) 

+ Activity Context: 

- experience transfer: What are the support mechanisms for transferring experience across 
projects? (e.g., human, experience base, automated) 

- reuse quality: What is the quality of each reuse activity? (e.g., high reliability, high 
predictability of modification cost, correctness, average performance) 
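
The process dimensions above can be recorded in the same way as the object dimensions. The following sketch is an illustration only: the names `ActivityType` and `ReuseActivity` and the field layout are assumptions, while the four activity types and the example categories come from the list above.

```python
from dataclasses import dataclass
from enum import Enum


class ActivityType(Enum):
    """The four basic reuse activities of the process model."""
    IDENTIFICATION = "identification"
    EVALUATION = "evaluation"
    MODIFICATION = "modification"
    INTEGRATION = "integration"


@dataclass(frozen=True)
class ReuseActivity:
    """Hypothetical record for one reuse activity."""
    # Activity
    name: str
    function: str
    activity_type: ActivityType
    mechanism: str
    # Activity interface
    input_output: str
    dependencies: str
    # Activity context
    experience_transfer: str
    reuse_quality: str


modify_generics = ReuseActivity(
    name="modify.generics",
    function="modify the candidate to entirely satisfy the specification",
    activity_type=ActivityType.MODIFICATION,
    mechanism="parameterized",
    input_output="one selected object-before-reuse, specification / object-after-reuse",
    dependencies="during coding stage; package specification needed",
    experience_transfer="experience base",
    reuse_quality="correctness",
)
```

Keeping the three groups (activity, activity interface, activity context) visible in the record preserves the rationale for each dimension, which is what distinguishes the model-based scheme from the flat scheme of Section 3.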


5. APPLYING MODEL-BASED REUSE CHARACTERIZATION SCHEMES 


We demonstrate the applicability of our model-based reuse scheme by characterizing three 
hypothetical reuse scenarios related to product, process and knowledge reuse: Ada generics, design 
inspections, and cost models (Section 5.1). The characterization of the Ada generics scenario is 
furthermore used to demonstrate the benefits of model-based characterizations to 
describe/understand/motivate a given reuse scenario (Section 5.2), to evaluate the cost of reuse 
(Section 5.3), and to plan the population of a reuse repository (Section 5.4). 


5.1. Example Reuse Characterizations 

The characterization scheme of section 4 has been applied to the three examples of product, 
process and knowledge reuse introduced in section 2. The resulting characterizations are contained 
in tables 2, 3, and 4: 

Reuse Examples 

Dimensions          | Ada generic              | design inspection           | cost model
--------------------+--------------------------+-----------------------------+-----------------------------
name                | buffer.ada               | sel_inspection.waterfall    | sel_cost_model.fortran
function            | <element>_buffer         | certify appropriateness     | predict project cost
                    |                          | of design documents         |
use                 | product                  | process                     | knowledge
type                | code document            | inspection method           | cost model
granularity         | package                  | design stage                | entire life cycle
representation      | Ada / generic package    | informal set of guidelines  | formal mathematical model
input/output        | formal and actual        | specification and design    | estimated product size in
                    | instantiation params     | document needed,            | KLOC, complexity rating,
                    |                          | defect data produced        | methodology level,
                    |                          |                             | cost in staff_hours
dependencies        | assumes Ada knowledge    | assumes a readable design,  | assumes a relatively
                    |                          | qualified reader            | homogeneous class of
                    |                          |                             | problems and environments
application domain  | ground support           | ground support              | ground support
                    | sw for satellites        | sw for satellites           | sw for satellites
solution domain     | waterfall (Fortran)      | waterfall (Fortran)         | waterfall (Fortran)
                    | life-cycle model,        | life-cycle model,           | life-cycle model,
                    | functional decomposition | standard set of methods     | standard set of methods
                    | design method            |                             |
object quality      | high reliability         | average defect detection    | average predictability
                    | (e.g., < 0.1 defects     | rate (e.g., > 0.6 defects   | (e.g., < 5 % prediction
                    | per KLoC for a given     | detected per staff_hour)    | error)
                    | set of acceptance tests) |                             |

Table 2: Characterization of Example Reuse Objects-Before-Reuse 

Reuse Examples 

Dimensions          | Ada generics             | design inspection           | cost model
--------------------+--------------------------+-----------------------------+-----------------------------
name                | string_buffer.ada        | sel_inspection.cleanroom    | sel_cost_model.ada
function            | string_buffer            | certify appropriateness     | predict project cost
                    |                          | of design documents         |
use                 | product                  | process                     | knowledge
type                | code document            | inspection method           | cost model
granularity         | package                  | design stage                | entire life cycle
representation      | Ada                      | informal set of guidelines  | formal mathematical model
input/output        | formal and actual        | specification and design    | estimated product size in
                    | instantiation params     | document needed,            | KLOC, complexity rating,
                    |                          | defect data produced        | methodology level,
                    |                          |                             | cost in staff_hours
dependencies        | assumes Ada knowledge    | assumes a readable design,  | assumes a relatively
                    |                          | qualified reader            | homogeneous class of
                    |                          |                             | problems and environments
application domain  | ground support           | ground support              | ground support
                    | sw for satellites        | sw for satellites           | sw for satellites
solution domain     | waterfall (Ada)          | Cleanroom (Fortran)         | waterfall (Ada)
                    | life-cycle model,        | development model,          | life-cycle model,
                    | object oriented          | stepwise refinement         | revised set of methods
                    | design method            | oriented design,            |
                    |                          | statistical testing         |
object quality      | high reliability         | high defect detection rate  | high predictability
                    | (e.g., < 0.1 defects     | (e.g., > 1.0 defects        | (e.g., < 2 % prediction
                    | per KLoC for a given set | detected per staff_hour)    | error)
                    | of acceptance tests),    | wrt. interface faults       |
                    | high performance         |                             |
                    | (e.g., max. response     |                             |
                    | times for a set of       |                             |
                    | tests)                   |                             |

Table 3: Characterization of Example Reuse Objects-After-Reuse 

Reuse Examples 

Dimensions          | Ada generics             | design inspection           | cost model
--------------------+--------------------------+-----------------------------+-----------------------------
name                | modify.generics          | modify.inspections          | modify.cost_models
function            | modify to satisfy        | modify to satisfy           | modify to satisfy
                    | target specification     | target specification        | target specification
type                | modification             | modification                | modification
mechanism           | parameterized            | unconstrained               | template-based
                    | (generic mechanism)      |                             |
input/output        | buffer.ada,              | sel_inspection.waterfall,   | sel_cost_model.fortran,
                    | reuse specification /    | reuse specification /       | reuse specification /
                    | string_buffer.ada        | sel_inspection.cleanroom    | sel_cost_model.ada
dependencies        | performed during coding  | performed during planning   | performed during planning
                    | stage, package           | stage, knowledge of         | stage, knowledge of
                    | specification needed,    | project plan                | historical project profiles
                    | knowledge of system      |                             |
                    | configuration plan       |                             |
experience transfer | experience base          | human and experience base   | human and experience base
reuse quality       | correctness              | correctness                 | correctness

Table 4: Characterization of Example Reuse Processes 


5.2. Describing/Understanding/Motivating Reuse Scenarios 

We will demonstrate the benefits of our reuse characterization scheme to describe, under- 
stand, and motivate the reuse of Ada generics as characterized in section 5.1. 

We assume that in some project the need has arisen to have an Ada package implementing 
a 'string_buffer' with high 'reliability and performance' characteristics. This need may have been 
established during the project planning phase based on domain analysis, or during the design or 
coding stages. This package will be integrated into a software system designed according to 
object-oriented principles. The complete reuse specification is contained in Table 3. 

First, we identify candidate objects based on some subset of the object related 
characteristics stated in Table 3: string_buffer.ada, string_buffer, product, code document, package, Ada. 
The more characteristics we use for identification, the smaller the resulting set of candidate 
objects will be. For example, if we include the name itself, we will either find exactly one object 
or none. Identification may take place during any project stage. We will assume that the set of 
successfully identified reuse candidates contains 'buffer.ada', the object characterized in Table 2. 
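
The identification step can be pictured as a filter over the repository. The sketch below is a simplified illustration that assumes exact matching of category values and a repository held as a list of dictionaries; the function name `identify` and the repository entry 'sort.for' are hypothetical and serve only to show how adding criteria narrows the candidate set.

```python
def identify(repository, criteria):
    """Return the objects whose characterization matches every criterion."""
    return [
        obj for obj in repository
        if all(obj.get(dim) == value for dim, value in criteria.items())
    ]


repository = [
    {"name": "buffer.ada", "use": "product", "type": "code document",
     "granularity": "package", "representation": "Ada"},
    # Hypothetical entry, added only to make the narrowing effect visible.
    {"name": "sort.for", "use": "product", "type": "code document",
     "granularity": "procedure", "representation": "Fortran"},
    {"name": "sel_cost_model.fortran", "use": "knowledge", "type": "cost model",
     "granularity": "entire life cycle",
     "representation": "formal mathematical model"},
]

# Few characteristics: a larger candidate set.
broad = identify(repository, {"use": "product", "type": "code document"})

# More characteristics: a smaller candidate set.
narrow = identify(repository, {"use": "product", "type": "code document",
                               "granularity": "package",
                               "representation": "Ada"})

# Including the name yields exactly one object or none.
by_name = identify(repository, {"name": "string_buffer.ada"})
```

With these entries, `broad` holds two candidates, `narrow` holds only 'buffer.ada', and `by_name` is empty because no object named 'string_buffer.ada' exists yet.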

Now we need to evaluate whether and to what degree 'buffer.ada' (as well as any other 
identified candidate) needs to be modified and estimate the cost of such modification compared to 
the cost required for creating the desired object ’string_buffer’ from scratch. Three characteristics 
of the chosen reuse candidate deviate from the expected ones: it is more general than needed (see 
function dimension), it has been developed according to a different design approach (see solution 
domain dimension), and it does not contain any information about its performance behavior (see 
object quality dimension). The functional discrepancy requires instantiating object 'buffer.ada' for 
data type ’string’. The cost of this modification is extremely low due to the fact that the generic 
instantiation mechanism in Ada can be used for modification (see Table 4). The remaining two 
discrepancies cannot be evaluated based on the information available through the characteriza- 
tions in section 5.1. On the one hand, ignoring the solution domain discrepancy may result in 
problems during the integration phase. On the other hand, it may be hard to predict the cost of 
transforming ’buffer.ada’ to adhere to object-oriented principles. Without additional information 
about either the integration of non-object-oriented packages or the cost of modification, we only 
have the choice between two risks. Predicting the cost of changes necessary to satisfy the stated 
object performance requirements is impossible because we have no information about the 
candidate’s performance behavior. It is noteworthy that very often practical reuse seems to fail 
because of lack of appropriate information to evaluate the reuse implications a-priori, rather than 
because of technical infeasibility. 
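
The three outcomes of this evaluation (a discrepancy bridged cheaply by a known mechanism, a known discrepancy of unknown cost, and a discrepancy that cannot even be established) can be summarized in a small classification routine. This is a sketch under stated assumptions: the function name, the three outcome labels, and the use of `None` to stand for missing information are all illustrative choices.

```python
def classify_discrepancies(gap, automated_mechanisms):
    """Assign each differing dimension to one of three evaluation outcomes.

    gap: {dimension: (candidate_value, required_value)}; a candidate_value
         of None means the candidate carries no information on that dimension.
    automated_mechanisms: {dimension: mechanism name} for dimensions that an
         automated modification mechanism can bridge.
    """
    outcome = {}
    for dim, (have, _want) in gap.items():
        if have is None:
            outcome[dim] = "no information"
        elif dim in automated_mechanisms:
            outcome[dim] = "low cost via " + automated_mechanisms[dim]
        else:
            outcome[dim] = "cost unknown"
    return outcome


gap = {
    "function": ("<element>_buffer", "string_buffer"),
    "solution domain": ("functional decomposition design",
                        "object oriented design"),
    "performance": (None, "high performance"),
}

result = classify_discrepancies(gap, {"function": "generic instantiation"})
```

For the 'buffer.ada' example, only the functional discrepancy falls into the low-cost class; the other two carry the risks discussed above.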

In case the object characterized in Table 2 has been modified successfully to satisfy the 
specification in Table 3, we need to integrate it into the ongoing development process. This task 
needs to be performed consistently with the system configuration plan and the process plan used 
in this project. 

The characterization of both objects (before/after-reuse) and the reuse process allows us to 
understand some of the implications and risks associated with discrepancies between identified 
reuse candidates and target reuse specification. Problems arise when we have either insufficient 
information about the existence of a discrepancy (e.g., object performance quality in our exam- 
ple), or no understanding of the implications of an identified discrepancy (e.g., solution domain in 
our example). In order to avoid the first type of problem, one may either constrain the 
identification process further by including characteristics other than just the object related ones, 
or not have any objects without ’performance’ data in the reuse repository. If we had included 
'desired solution domain' and 'object performance' as additional criteria in our identification 
process, we might not have selected object 'buffer.ada' at all. If every object in our repository 
had performance data attached to it, we would at least be able to establish the fact that there 
exists a discrepancy. In order to avoid the second type of problem, we need to have some 
(semi-)automated modification mechanism, or at least historical data about the cost involved in similar 
past situations. It is clear that in our example any functional discrepancy within the scope of the 
instantiation parameters is easy to bridge due to the availability of a completely automated 
modification mechanism (i.e., generic instantiation in Ada). Any functional discrepancy that 
cannot be bridged through this mechanism poses a larger and possibly unpredictable risk. Whether 
it is more costly to re-design 'buffer.ada' in order to adhere to object-oriented design principles or 
to re-develop it from scratch is not obvious without past experience. 

Based on the preceding discussion, the motivational benefits are that we have a sound 
rationale for suggesting the use of certain reuse mechanisms (e.g., automated in the case of Ada 
packages to reduce the modification cost), criteria for populating a reuse repository (e.g., do 
exclude objects without performance data to avoid the unnecessary expansion of the search 
space), criteria for identifying reuse candidates effectively according to some reuse specification 
(e.g., do include solution domain to avoid the identification of candidates with unpredictable 
modification cost), or certain types of reuse specifications (e.g., require that each reuse request is 
specified in terms of all object dimensions, except probably name, and all system context dimen- 
sions). 

5.3. Evaluating the Cost of Reuse 

We will demonstrate the benefits of our reuse characterization scheme to evaluate the cost 
of reusing Ada generics as characterized in section 5.1. 

The general evaluation goals are (i) to characterize the degree of discrepancy between a given 
reuse specification (see Table 3) and a given reuse candidate (see Table 2), and (ii) to determine the cost of 
bridging the gap between before-reuse and after-reuse characteristics. The first type of evaluation 
goal can be achieved by capturing detailed information with respect to the object-before-reuse 
and object-after-reuse dimensions. The second goal requires the inclusion of data characterizing 
the reuse process itself and past experience about similar reuse activities. 

We use the goal/question/metric paradigm to perform the above kind of goal-oriented 
evaluation [6, 8, 10]. It provides templates for guiding the selection of appropriate metrics based 
on a precise definition of the evaluation goal. Guidance exists at the level of identifying certain 
types of metrics (e.g., to quantify the object of interest, to quantify the perspective of interest, to 
quantify the quality aspect of interest). Using the goal/question/metric paradigm in conjunction 
with reuse characterizations like the ones depicted in Tables 2, 3, and 4, provides very detailed 
guidance as to what exact metrics need to be used. For example, evaluation of the Ada generic 
example suggests metrics to characterize discrepancies between the desired object-after-reuse and 
all before-reuse candidates in terms of (i) function, use, type, granularity, and representation on a 
nominal scale defined by the respective categories, (ii) input/output interface on an ordinal scale 
'number of instantiation params', (iii) application and solution domains on nominal scales, and 
(iv) qualities such as performance based on benchmark tests. 
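
The difference between the two scale types can be shown in a few lines. The sketch below assumes that a nominal metric only reports whether two categories are the same, and that the ordinal metric on the number of instantiation parameters only reports an ordering; the function names and the abbreviated category values (in particular, 'Ada' is used for the representation of both objects) are simplifying assumptions.

```python
NOMINAL_DIMENSIONS = ("function", "use", "type", "granularity",
                      "representation", "application domain",
                      "solution domain")


def nominal_metric(candidate, spec, dimension):
    """Nominal scale: categories can only be the same or different."""
    return "same" if candidate[dimension] == spec[dimension] else "different"


def ordinal_metric(candidate_count, spec_count):
    """Ordinal scale on a count: report only the ordering (-1, 0, or 1)."""
    return (candidate_count > spec_count) - (candidate_count < spec_count)


# Abbreviated values for the Ada generic example.
candidate = {
    "function": "<element>_buffer", "use": "product",
    "type": "code document", "granularity": "package",
    "representation": "Ada",
    "application domain": "ground support sw for satellites",
    "solution domain": "functional decomposition design",
}
spec = {
    "function": "string_buffer", "use": "product",
    "type": "code document", "granularity": "package",
    "representation": "Ada",
    "application domain": "ground support sw for satellites",
    "solution domain": "object oriented design",
}

profile = {dim: nominal_metric(candidate, spec, dim)
           for dim in NOMINAL_DIMENSIONS}
```

Quality metrics such as performance are not covered by this sketch, since they require benchmark data rather than a comparison of categories.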


5.4. Planning the Population of Reuse Repositories 

We will demonstrate the benefits of our reuse characterization scheme to populate a reuse 
repository with generic Ada packages as characterized in section 5.1. 

Reuse is economical from a project perspective if the effort required to bridge the gap 
between an object-before-reuse (available in some experience base) and the desired 
object-after-reuse is less than the effort required to create the object-after-reuse from scratch. 
Reuse is economical from an organization's perspective if the effort required for creating the 
reuse repository is less than the sum of all project-specific savings based on reuse. 
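
Both conditions are simple inequalities and can be written out directly. In the sketch below the function names are illustrative and all effort figures are hypothetical staff-hours invented for the example; they are not measurements from any project.

```python
def economical_for_project(bridge_effort, scratch_effort):
    """Project view: bridging the gap must cost less than building anew."""
    return bridge_effort < scratch_effort


def economical_for_organization(repository_effort, project_savings):
    """Organization view: populating the repository must cost less than
    the sum of all project-specific savings."""
    return repository_effort < sum(project_savings)


# Hypothetical figures in staff-hours.
bridge, scratch = 5, 40
saving_per_project = scratch - bridge          # 35 staff-hours saved

project_ok = economical_for_project(bridge, scratch)
one_project = economical_for_organization(60, [saving_per_project])
two_projects = economical_for_organization(60, [saving_per_project] * 2)
```

With these figures, reuse pays off for each individual project, but the repository effort of 60 staff-hours is recovered only once the object has been reused in two projects.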

Based on the above statement, populating a reuse repository constitutes an optimization 
problem for the organization. For example, high effort for populating a reuse repository may be 
justified if (i) small savings in many projects are expected, or (ii) large savings in a small number 
of projects are expected. For example, object 'buffer.ada' could have been transformed to adhere 
to object-oriented principles prior to introducing it into the repository. This would have excluded 
the project-specific risk and cost. 

The cost of reusing an object-before-reuse from an experience base depends on its distance 
to the desired object-after-reuse and the mechanisms employed to bridge that distance. The cost 
of populating a reuse repository depends on how much effort is required to transform existing 
objects into objects-before-reuse. Both efforts together are aimed at bridging the gap between the 
project in which some objects were produced and the projects in which they are intended to be 
reused. The inclusion of a generic package 'buffer.ada' into the repository instead of specific 
instances 'integer_buffer.ada' and 'real_buffer.ada' requires some up-front transformation (i.e., 
abstraction). The advantage of creating an object 'buffer.ada' is that it reduces the project- 
specific cost of creating object 'string_buffer.ada' (or any other buffer for that matter) and 
quantifies the cost of modification. 


Finding the appropriate characteristics for objects-before-reuse to minimize project-specific 
reuse costs requires a good understanding of future reuse needs (objects-after-reuse) and the reuse 
processes to be employed (reuse process). The more one knows about future reuse needs within an 
organization, the better job one can do of populating a repository. For example, the object- 
before-reuse characteristics of Ada generics in Table 2 were derived from the corresponding 
object-after-reuse and reuse process characteristics in Tables 3 and 4. It would have made no 
sense to include Ada generics in the experience base that (i) are not based on the same instan- 
tiation parameters as all anticipated objects-after-reuse, because modification is assumed to occur via 
parameterized instantiation, (ii) do not exhibit high reliability and performance, or (iii) do not 
have the same solution domain, unless we understand the implications of different solution domains. 
Without any knowledge of the object-after-reuse and reuse process characteristics, the task of 
populating a reuse repository is about as meaningful as investing in the mass production of con- 
crete components in the area of civil engineering without knowing whether we want to build 
bridges, town houses, or high-rise buildings. 


6. A REUSE-ORIENTED SOFTWARE ENVIRONMENT MODEL 

Effective reuse according to the reuse-oriented software development model depicted in Fig- 
ure 2 of Section 2 needs to take place in an environment that supports continuous improvement, 
i.e., recording of experience across all projects, appropriate packaging and storing of recorded 
experience, and reusing existing experience whenever feasible. Figure 6 depicts such an environ- 
ment model. 

Figure 5: Reuse-Oriented Software Environment Model 


Each project is performed according to an organizational process model based on the 
improvement paradigm [2, 5]: 

1. Characterize: Identify characteristics of the current project environment so that the 
appropriate past experience can be made available to the current project. 

2. Plan: (A) Set up the goals for the project and refine them into quantifiable questions and 
metrics for successful project performance and improvement over previous project performances 
(e.g., based upon the goal/question/metric paradigm [6]). 

(B) Choose the appropriate software development process model for this project with the sup- 
porting methods and tools — both for construction and analysis. 

3. Execute: (A) Construct the products according to the chosen development process model, 
methods and tools. 

(B) Collect the prescribed data, validate and analyze it to provide feedback in real-time for 
corrective action on the current project. 

4. Feedback: (A) Analyze the data to evaluate the current practices, determine problems, record 
findings and make recommendations for improvement for future projects. 

(B) Package the experiences in the form of updated and refined models and other forms of 
structured knowledge gained from this and previous projects, and save it in an experience base 
so it can be available to future projects. 

The experience base is not a passive entity that simply stores experience. It is an active 
organizational entity in the context of the reuse-oriented environment model which - in addition 
to storing experience in a variety of repositories - involves the constant modification of experience 
to increase its reuse potential. It plays the role of an organizational “server” aimed at satisfying 
project-specific requests effectively. The constant collection of measurement data regarding 
objects-after-reuse and the reuse processes themselves enables the judgements needed to populate 
the experience base effectively and to select the best suited objects-before-reuse to satisfy 
project-specific reuse needs based upon experiences. The organizational process model based on 
the improvement paradigm supports the integration of measurement-based analysis and construc- 
tion. 





For more detail about the reuse-oriented environment model, the reader is referred to [7]. 


7. CONCLUSIONS 

The model-based reuse characterization scheme introduced in this paper has advantages 
over existing schemes in that it (a) allows us to capture the reuse of any type of experience, (b) 
distinguishes between objects-before-reuse, objects-after-reuse, and the reuse process itself, and 
(c) provides a rationale for the chosen characterizing dimensions. In the past, the scope of most 
reuse schemes was limited to objects-before-reuse. 

We have demonstrated the advantages of such a model-based scheme by applying it to the 
characterization of example reuse scenarios. In particular, its usefulness for evaluating the cost of 
reuse and planning the population of reuse repositories was stressed. 

Finally, we presented a model of how we believe reuse should be integrated into an environment 
aimed at continuous improvement based on learning and reuse. A specific instantiation of such 
an environment, the 'code factory', is currently being developed at the University of Maryland 
[12]. In order to make reuse a reality, more research is required towards understanding and con- 
ceptualizing activities and aspects related to reuse, learning, and the experience base. 

Treating maintenance as a reuse-oriented development process provides a choice of 
maintenance approaches and improves the overall evolution process. 


January 1990 


If you believe that software should be 
developed with the goal of maximiz- 
ing the reuse of experience in the 
form of knowledge, processes, products, 
and tools, the maintenance process is log- 
ically and ideally suited to a reuse-ori- 
ented development process. There are 
many reuse models, but the key issue is 
which process model is best suited to the 
maintenance problem at hand. 

In this article, I present a high-level or- 
ganizational paradigm for development 
and maintenance in which an organiza- 
tion can learn from development and 
maintenance tasks and then apply that 
paradigm to several maintenance process 
models. Associated with the paradigm is a 
mechanism for setting measurable goals 
so you can evaluate the process and the 
product and learn from experience. 


An earlier version of this article was given as the keynote 
presentation at the Conference on Software Mainte- 
nance in October 1988. 

Maintenance models 

Most software systems are complex, and 
modification requires a deep un- 
derstanding of the functional and non- 
functional requirements, the mapping of 
functions to system components, and the 
interaction of components. Without good 
documentation of the requirements, de- 
sign, and code with respect to function, 
traceability, and structure, maintenance 
becomes a difficult, expensive, and error- 
prone task. As early as 1976, Les Belady 
and Manny Lehman reported on the 
problems with the evolution of IBM 
OS/360.* The literature is filled with sim- 
ilar examples. 

Maintenance comprises several types of 
activities: correcting faults in the system, 
adapting the system to a changing operat- 
ing environment (such as new terminals 
and operating-system modifications), and 
adapting the system to changes in the 
original requirements. The new system is 
like the old system, yet it is also different in 
a specific set of characteristics. 

Figure 1. Quick-fix process model. (Diagram panels: old system, new system.) 

Figure 2. Iterative-enhancement model. 

Figure 3. Full-reuse model. 

You can view the new version of the sys- 
tem as a modification of the old system or 
as a new system that reuses many of the old 
system's components. Although these two 
views have many aspects in common, they 
are very different in how you organize the 
maintenance process, the effects on fu- 
ture products, and the support environ- 
ments required. 

Consider the following three mainte- 
nance process models: 

• the quick-fix model, 

• the iterative-enhancement model, and 

• the full-reuse model. 

All three models reuse the old system 
and so are reuse-oriented. Which model 
you choose for a particular modification is 
determined by a combination of manage- 
ment and technical decisions that depend 
on the characteristics of the modification, 
the future evolution of the product line, 
and the support environment available. 

Each model assumes that there is a com- 
plete and consistent set of documents de- 
scribing the existing system, from require- 
ments through code. Although this may 
be a naive assumption in practice, a side 
effect of this article's presentation should 
be to motivate organizations to gain the 
benefits of having such documentation. 

Quick-fix model. The quick-fix model 
represents an abstraction of the typical ap- 
proach to software maintenance. In the 
quick-fix model, you take the existing sys- 
tem, usually just the source code, and 
make the necessary changes to the code 
and the accompanying documentation 
and recompile the system as a new ver- 
sion. This may be as straightforward as a 
change to some internal component, like 
an error correction involving a single 
component, or a structural change or even 
some functional enhancement. 

Figure 1 demonstrates the flow of 
change from the old system's source code 
to the new version's source code. It is as- 
sumed (but not always true) that the 
accompanying documentation is also up- 
dated. You can view this model as reuse- 
oriented, since you can view the model as 
creating a new system by reusing the old 
system or as simply modifying the old sys- 
tem. However, viewing it in a reuse orien- 
tation gives you more freedom in the 
scope of change than viewing it in a modi- 
fication or patch orientation. 

Iterative-enhancement model. Iterative 
enhancement is an evolutionary model 
proposed for development in environ- 
ments where the complete set of require- 
ments for a system was not fully un- 
derstood or where the developer did not 
know how to build the full system. Al- 
though iterative enhancement was pro- 
posed as a development model, it is well 
suited to maintenance. It assumes a com- 
plete and consistent set of documents de- 
scribing the system. The iterative-en- 
hancement model 

• starts with the existing system's re- 
quirements, design, code, test, and analy- 
sis documents; 

• modifies the set of documents, starting 
with the highest-level document affected 
by the changes, propagating the changes 
down through the full set of documents; 
and 

• at each step of the evolutionary pro- 
cess, lets you redesign the system, based 
on analysis of the existing system. 

The process assumes that the mainte- 
nance organization can analyze the exist- 
ing product, characterize the proposed 
set of modifications, and redesign the cur- 
rent version where necessary for the new 
capabilities. 

Figure 2 demonstrates the flow of 
change from the highest-level document 
affected by the change through the low- 
est-level document. This model supports 
the reuse orientation more explicitly. An 
environment that supports the iterative- 
enhancement model clearly supports the 
quick-fix model. 

Full-reuse model. While iterative en- 
hancement starts with evaluating the ex- 
isting system for redesign and modifica- 
tion, a full-reuse process model starts with 
the requirements analysis and design of 
the new system and reuses the appropri- 
ate requirements, design, and code from 
any earlier versions of the old system. It 
assumes a repository of documents and 
components defining earlier versions of 
the current system and similar systems. 
The full-reuse model 

• starts with the requirements for the 
new system, reusing as much of the old 
system as feasible, and 

• builds a new system using documents 
and components from the old system and 
from other systems available in your re- 
pository; you develop new documents and 
components where necessary. 

Here, reuse is explicit, packaging of ex- 
isting components is necessary, and analy- 
sis is required to select the appropriate 
components. 

Figure 3 demonstrates the flow of vari- 
ous documents into the various docu- 
ment repositories (which are all part of 
the larger repository) and how those re- 
positories are accessed for documents for 
the new development. There is an as- 
sumption that the items in the repository 
are classified according to a variety of 
characteristics, some of which I describe 
later in the article. 

This repository may contain more than 
just the documents from the earlier sys- 
tem; it may contain documents from 
earlier versions, documents from other 
products in the product line, and some 
generic reusable forms of documents. An 
environment that supports the full-reuse 
model clearly supports the other two 
models. 

Model differences. The difference be- 
tween the last two approaches is more one 
of perspective than style. The full-reuse 
model frees you to design the new sys- 
tem's solution from the set of solutions of 
similar systems. The iterative-enhance- 
ment model takes the last version of the 
current system and enhances it. 

Both approaches encourage redesign, 
but the full-reuse model offers a broader 
set of items for reuse and can lead to the 
development of more reusable compo- 
nents for future systems. By contrast, the 
iterative-enhancement model encourages 
you to tailor existing systems to get the ex- 
tensions for the new system. 

Reuse framework 

The existence of multiple maintenance 
models raises several questions. Which is 
the most appropriate model for a particu- 
lar environment? a particular system? a 
particular set of changes? the task at 
hand? How do you improve each step in 
the process model you have chosen? How 
do you minimize overall cost and maxi- 
mize overall quality? 

To answer these questions, you need a 
model of the object of reuse, a model of 
the process that adapts that object to its 
target application, and a model of the re- 
used object within its target application. 
Figure 4 shows a simple model for reuse. 
In this model, an object is any software 
process or product and a transformation 
is the set of activities performed when re- 
using that object. 

The model steps are 

• identifying the candidate reusable 
pieces of the old object, 

• understanding them, 

• modifying them to your needs, and 

• integrating them into the process. 

To flesh out the model, you need a 
framework for categorizing objects, trans- 
formations, and their context. The frame- 
work should cover various categories. For 
example, is the object of reuse a process or 
a product? In each category, there are 
various classification schemes for the 
product (such as requirements docu- 
ment, code module, and test plan) and 
for the process (such as cost estimation, 
risk analysis, and design). 

Framework dimensions. There are a 
variety of approaches to classifying reus- 
able objects, most notably the faceted 
scheme offered by Ruben Prieto-Diaz and 
Peter Freeman.^ I offer here a scheme that 
categorizes three aspects of reuse: the re- 
usable object, the reusable object's con- 
text, and the process of transforming that 
object. This scheme owes much to ideas 
presented at the 1987 Minnowbrook 
Workshop on Software Reuse. 

Object dimensions include: 

• Reuse-object type. What is a character- 
ization of the candidate reuse object? 
Sample process classifications include a 
design method and a test technique; 
product classifications include source 
code and requirements documents. 

• Self-containedness. How independent 
and understandable is the candidate ob- 
ject? Sample classifications include syn- 
tactic independence (such as a data-cou- 
pling measure) and specification 
precision (such as functional notation 
and English). 

• Reuse-object quality. How good is the 
candidate reuse object? Sample classifica- 
tions include maturity (such as the num- 
ber of systems using it), complexity (such 
as cyclomatic complexity), and reliability 
(such as the number of failures during 
previous use). 

Context dimensions include: 

• Requirements domain. How similar 
are the requirements domains of the can- 
didate reuse object and the current proj- 
ect? Sample classifications are application 
(such as ground-support software for sat- 
ellites) and distance (such as same appli- 
cation or similar algorithms but different 
problem focus). 

• Solution domain. How similar are the 
evolution processes that resulted in the 
candidate reuse objects and the ones used 
in the current project? Sample classifica- 
tions are process model (such as the 
waterfall model), design method (such as 
function decomposition), and language 
(such as Fortran). 

• Knowledge-transfer mechanism. How 
is information about the candidate reuse 
objects and their context passed to cur- 
rent and future projects? People, such as a 
subset of the development team, provide 
a common knowledge-transfer mecha- 
nism. 

Transformation dimensions include: 

• Transformation type. How do you 
characterize transformation activities? 
Sample classifications include percent of 
change required, direction of change 
(such as general to domain-specific or 
project-specific to domain-specific), mod- 
ification mechanism (such as verbatim, 
parameterized, template-based, or un- 
constrained), and identification mecha- 
nism (such as by name or by functional 
requirements). 

• Activity integration. How do you inte- 
grate the transformation activities into the 
new system development? One sample 
classification is the phase where the activ- 
ity is performed in the new development 
(for example, planning, requirements de- 
velopment, and design). 

• Transformed quality. What is the con- 
tribution of the reuse object to the new 
system compared to the objectives set for 
it? Sample classifications are reliability 
(such as no failures associated with that 
component) and performance (such as 
satisfying a timing requirement). 

Comparing the models. When applying 
the reuse framework to maintenance, the 
set of reuse objects is a set of product doc- 
uments. You compare the models to see 
which is appropriate for the current set of 
changes according to the framework's 
three dimensions. 
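
As a minimal sketch of how the object, context, and transformation dimensions listed above might be recorded for one repository item, consider the following. All class and field names are assumptions chosen for illustration; the article itself prescribes no data format.

```python
from dataclasses import dataclass


@dataclass
class ObjectDimensions:
    reuse_object_type: str    # e.g. "source code" or "design method"
    self_containedness: str   # e.g. a data-coupling measure
    quality: str              # e.g. number of systems using the object


@dataclass
class ContextDimensions:
    requirements_domain: str  # e.g. "same application"
    solution_domain: str      # e.g. process model, design method, language
    knowledge_transfer: str   # e.g. "people from the development team"


@dataclass
class TransformationDimensions:
    transformation_type: str   # e.g. "parameterized" modification
    activity_integration: str  # e.g. phase in which the activity is performed
    transformed_quality: str   # e.g. "satisfies timing requirement"


@dataclass
class ReuseCharacterization:
    """One classified repository item (hypothetical record layout)."""
    name: str
    obj: ObjectDimensions
    context: ContextDimensions
    transformation: TransformationDimensions

    def as_rows(self):
        """Flatten the record into (group, dimension, value) rows for comparison."""
        rows = []
        for group in ("obj", "context", "transformation"):
            for key, value in vars(getattr(self, group)).items():
                rows.append((group, key, value))
        return rows
```

Flattening each record into rows makes it straightforward to line up candidate objects dimension by dimension when comparing them.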

First consider the reuse-object dimen- 
sion: 

The objects of the quick-fix and itera- 
tive-enhancement models are the set of 
documents representing the old system. 
The object of the full-reuse model is any 
appropriate document in the repository. 

For self-containedness, all the models 
depend on the unit of change. The quick- 
fix model depends on how much evolu- 
tion has taken place, since the system may 
have lost structure over time as objects 
were added, modified, and deleted. In it- 
erative enhancement, the evolved sys- 
tem's structure and understandability 
should improve with respect to the appli- 
cation and the classes of changes made so 
far. In the full-reuse model, the evolved 
system's structure, understandability, and 
generality should improve; the degree of 
improvement will depend on the quality 
and maturity of the repository. 

For reuse-object quality, the quick-fix 
model offers little knowledge about the 
old object's quality. In iterative enhance- 
ment, the analysis phase provides a fair as- 
sessment of the system's quality. In full 
reuse, you have an assessment of the reuse 
object's quality across several systems. 

Now consider the context dimensions: 

For the requirements domain, the 
quick-fix and iterative-enhancement 
models assume that you are reusing the 
same application (in fact, the same proj- 
ect). The full-reuse model allows manage- 
able variation in the application domain, 
depending on what is available in the re- 
pository. 

For the solution domain, the quick-fix 
model assumes the same solution struc- 
ture exists during maintenance as during 
development. There is no change in the 
basic design or structure of the new sys- 
tem. In iterative enhancement, some 
modification to the solution structure is 
allowed because redesign is a part of the 
model. The full-reuse model allows major 
differences in the solution structure: You 
can completely redesign the system from a 
structure based on functional decomposi- 
tion to one based on object-oriented de- 
sign, for example. 

For the knowledge-transfer mechanism, 
the quick-fix and iterative-enhancement 
models work best when the same people 
are developing and maintaining the sys- 
tem. The full-reuse model can compen- 
sate for having a different team, assuming 
that you have application specialists and a 
well-documented reuse-object repository. 

Last, consider the transformation di- 
mension: 

For the transformation type, the quick- 
fix model typically uses activities like 
source-code lookup, reading for un- 
derstanding, unconstrained modifica- 
tion, and recompilation. Iterative en- 
hancement typically begins with a search 
through the highest-level (most abstract) 
document affected by the modification, 
changing it and evolving the subsequent 
documents to be consistent, using several 
modification mechanisms. The full-reuse 
model uses a library search and several 
modification mechanisms; those selected 
depend on the type of change. In full 
reuse, modification is done off-line. 

For activity integration, all activities are 
performed at the same time in the quick-fix 
model. Iterative enhancement associates 
the activities with all the normal develop- 
ment phases. In full reuse, you identify the 
candidate reusable pieces during project 
planning and perform the other activities 
during development. 

For transformed quality, the quick-fix 
model usually works best on small, well- 
contained modifications because their ef- 
fects on the system can be understood and 
verified in context. Iterative enhance- 
ment is more appropriate for larger 
changes where the analysis phase can pro- 
vide better assessment of the full effects of 
changes. Full reuse is appropriate for 
large changes and major redesigns. Here, 
analysis and performance history of the 
reuse objects support quality. 

Applying the models. Given these differ- 
ences, you can analyze the maintenance 
process models and recommend where 
they might be most applicable. 

But first, consider the relationship be- 
tween the development and maintenance 
process models: You can consider devel- 
opment to be a subset of maintenance. 
Maintenance environments differ from 
development environments in the con- 
straints on the solution, customer de- 
mand, timeliness of response, and organi- 
zation. 

Most maintenance organizations are set 
up for the quick-fix model but not for the 
iterative-enhancement or full-reuse mod- 
els, since they are responding to timeli- 
ness: a system failure needs to be fixed 
immediately or a customer demands a 
modification of the system's functionality. 
The quick-fix model is best used when there is little 
chance the system will be modified again. 

Clearly, these are the quick-fix model's 
strengths. But its weaknesses are that the 
modification is usually a patch that is not 
well-documented, the structure of the sys- 
tem has been partly destroyed, making fu- 
ture evolution of the system difficult and 
error-ridden, and the model is not com- 
patible with development processes. 

The iterative-enhancement model al- 
lows redesign that lets the system struc- 
ture evolve, making future modifications 
easier. It focuses on making the system as 
good as possible. It is compatible with de- 
velopment process models. It is a good ap- 
proach to use when the product will have 
a long life and evolve over time. In this 
case, if timeliness is also a constraint, you 
can use the quick-fix model for patches 
and the iterative-enhancement model for 
long-term change, replacing the patches. 
The drawbacks are that it is a more costly 
and possibly less timely approach (in the 
short run) than the quick-fix model and 
provides little support for generic compo- 
nents or future, similar systems. 

The full-reuse model gives the main- 
tainer the greatest degree of freedom for 
change, focusing on long-range develop- 
ment for a set of products, which has the 
side effect of creating reusable compo- 
nents of all kinds for future develop- 
ments. It is compatible with development 
process models and, in fact, is the way de- 
velopment models should evolve. It is best 
used when you have multiproduct envi- 
ronments or generic development where 
the product line has a long life. Its draw- 
back is that it is more costly in the short 
run and is not appropriate for small mod- 
ifications (although you can combine it 
with other models for such changes). 

My assessment of when to apply these 
models is informal and intuitive, since it is 
a qualitative analysis. To do a quantitative 
analysis, you would need quantitative 
models of the reuse objects, trans- 
formations, and context. You would need 
a measurement framework to character- 
ize (via classification), evaluate, predict, 
and motivate management and technical 
decisions. To do this, you would need to 
apply to the models a mechanism for gen- 
erating and interpreting quantitative 
measurement, like the goal/ques- 
tion/metric paradigm.*^ (See the box on 
p. 24 for a description of this paradigm 
and its application to choosing the appro- 
priate maintenance process model.) 



There are many support mechanisms 
necessary to achieve maximum reuse that 
have not been sufficiently emphasized in 
the literature. In this article, I have pre- 
sented several: a set of maintenance mod- 
els, a mechanism for choosing the appro- 
priate models based on the goals and 
characteristics of the problem at hand, 
and a measurement and evaluation mech- 
anism. To support these activities, there is 
a need for an improvement paradigm that 
helps organizations evaluate, learn, and 
enhance their software processes and 
products, a reuse-oriented evolution envi- 
ronment that encourages and supports 
reuse, and automated support for both 
the paradigm and environment as well as 
for measurement and evaluation. 

Improvement paradigm. The improve- 
ment paradigm** is a high-level organiza- 
tional process model in which the organi- 
zation learns how to improve its products 
and process. In this model, the organiza- 
tion should learn how to make better deci- 
sions on which process model to use for 
the maintenance of its future products 
based on past performance. The para- 
digm has three parts: planning, analysis, 
and learning and feedback. 

In planning, there are three integrated 
activities that are iteratively applied: 

• Characterize the current project envi- 
ronment to provide a quantitative analysis 
of the environment and a model of the 
project in the context of that environ- 
ment. For maintenance, the characteriza- 
tion provides product-dimension data, 
change and defect data, cost data, and 
customer-context data for earlier versions 
of the system, information about the 
classes of candidate components available 
in the repository for the new system, and 
any feedback from previous projects with 
experience with different models for the 
types of modifications required. 

• Set up goals and refine them into 
quantifiable questions and metrics using 
the goal/question/metric paradigm to 
get performance that has improved com- 
pared to previous projects. This consists of 
a top-down analysis of goals that iteratively 
decomposes high-level goals into detailed 
subgoals. The iteration terminates with 
subgoals that you can measure directly. 

• Choose and tailor the appropriate 
construction model for this project and 
the supporting methods and tools to sat- 
isfy the project goals. Understanding the 
environment quantitatively lets you 
choose the appropriate process model 
and fine-tune the methods and tools 
needed to be most effective. For example, 
knowing the effect of earlier applications 
of the maintenance models and methods 
in creating new projects from old systems 
lets you choose and fine-tune the appro- 
priate process model and methods that 
have been most effective in creating new 
systems of the type required from older 
versions and component parts in the re- 
pository. 
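
The top-down goal refinement described in the second planning activity above can be sketched as a small tree structure. The class names and the rule used for "measurable directly" (a leaf subgoal whose questions each carry at least one metric) are assumptions made for illustration, not a definition taken from the goal/question/metric literature.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    text: str
    metrics: list = field(default_factory=list)  # names of metrics answering it


@dataclass
class Goal:
    statement: str
    subgoals: list = field(default_factory=list)
    questions: list = field(default_factory=list)

    def is_measurable(self) -> bool:
        """A leaf goal is measurable if every one of its questions has a metric."""
        return (
            not self.subgoals
            and bool(self.questions)
            and all(q.metrics for q in self.questions)
        )

    def unmeasured_leaves(self) -> list:
        """Return statements of leaf subgoals that still need refinement."""
        if not self.subgoals:
            return [] if self.is_measurable() else [self.statement]
        pending = []
        for subgoal in self.subgoals:
            pending.extend(subgoal.unmeasured_leaves())
        return pending
```

Refinement can stop once `unmeasured_leaves()` returns an empty list, which mirrors the statement that the iteration terminates with subgoals that can be measured directly.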

In analysis, you evaluate the current 
practices, determine problems, record 
the findings, and make recommendations 
for improvement. You must conduct data 
analysis during and after the project. The 
goal/question/metric paradigm lets you 
trace from goals to metrics and back, 
which lets you interpret the measurement 
in context to ensure a focused, simpler 
analysis. The goal-driven operational 
measures provide a framework for the 
kind of analysis you need. 

In learning and feedback, you organize 
and encode the quantitative and qualita- 
tive experience gained from the current 
project into a corporate information base 
to help improve planning, development, 
and assessment for future projects. You 
can feed the results of the analysis and in- 
terpretation phase back to the organiza- 
tion to change how it does business based 
on explicitly determined successes and 
failures. 

In this way, you can learn how to im- 
prove quality and productivity and how to 
improve goal definition and assessment. 
You can start the next project with the ex- 
perience gained from this and previous 
projects. For example, understanding the 
problems associated with each new ver- 
sion of a system provides insights into the 
need for redesign and redevelopment. 

Reuse-oriented environment. Reuse can be more effectively achieved
in an environment that supports reuse. (See the article by Ted
Biggerstaff and Charles Richter for a set of reusability
technologies and the article by myself and Dieter Rombach for a set
of environment
Goal/question/metric paradigm

The goal/question/metric paradigm represents a systematic approach
for setting project goals (tailored to the needs of an organization)
and defining them in an operational, tractable way. Goals are
associated with a set of quantifiable questions and models that
specify metrics and data for collection. The tractability of this
software-engineering process supports the analysis of the collected
data and computed metrics in the appropriate context of the
questions, models, and goals; feedback (by integrating constructive
and analytic activities); and learning (by defining the appropriate
synthesis procedure for lower level into higher level pieces of
experience).

The goals are defined in terms of purpose (why the project is being
analyzed), perspective (the models of interest and the point of view
of the analysis), and the environment (the context of the project).
When measuring a product or process, you ask questions in three
general categories:

• product or process definition,

• definition of the quality perspectives of interest, and

• feedback.

Product definition includes physical attributes of the product,
cost, changes and defects, and the context in which the product will
be used. Process definition includes a model of the process, an
evaluation of conformance to the model, and an assessment of the
project-specific documents and experience with the application.

Definition of the quality perspectives of interest includes the
quality models used (such as reliability and user friendliness) and
the interpretation of the data collected relative to the models.

Feedback involves the return of information for improving the
product and process based on the quality perspective of interest.

The following is an informal application of the goal/question/metric
paradigm to a particular maintenance problem. The answers to some of
the questions are obvious. The answers to others assume a database
of experience that management must estimate if it is not available.

Goals. The goal-definition phase has three parts:

• Purpose: Analyze the new product requirements to determine the
appropriate evolution model.

• Perspective: Examine the cost of the current enhancement and
future evolution of the system from the organization's point of
view.

• Environment: Along with the standard environmental factors, like
resource and problem factors, you would like to pay special
attention to the context dimensions in the reuse framework.

In the requirements domain, you typically use product objects from
the same application domain, although you can choose objects from
other domains in the repository, if they are generally applicable.

The solution domain defines the process models, methods, and tools
used in the development of the old product. If you plan to use the
same processes for the evolving product, there is no problem with
reuse. If future evolution dictates changes to the solution domain,
the full-reuse model lets you make these changes, but at the cost of
reusing less of the old product.

For knowledge-transfer mechanism, you must determine what form of
documentation is needed to transfer the required application,
process, and product knowledge to the maintainers. If the
maintenance group is the same as the development group, the major
transfer mechanism is the people.

Product definition. With the goal defined, you then define your
product. In this example, there are several products: the new
product to be built (the new version of the system), the old
versions, and any other relevant objects in the repository that may
be reused.

For the category of physical attributes, sample questions are: How
many requirements are there for the new system? What is the mapping
of the requirements to system components in the old system? How
independent are the components to be modified in the old system?
What is the complexity of the old system and its individual
components? What candidate objects are available in the repository
and what are their object, context, and transformation
classifications? How many new requirements, categorized by class
(such as size, type, and whether it is a new, modified, or deleted
requirement) are there that are not in the old system? How many
components, categorized by class (such as size and type of change)
in the old system must be changed, added, and deleted?

For the category of changes and defects, sample questions are: How
many errors, faults, and failures (categorized by class) are there
associated with the requirements and components that need to be
changed? What is the profile of past and future changes to the
system, categorized by class (such as cost and number of times a
component has been and must be changed)?

For the category of cost, sample questions are: What was the cost of
the original system? What was the cost of each prior version? What
is the cost of each prior requirement change by class? What is the
estimated cost of modifying the old system to meet the new
requirements? What is the estimated cost of building a new system,
reusing the experience and parts of the old system and the
repository?

For the category of customer context, sample questions are: What are
the various customer classes and how are they using the system? What
are the estimated future enhancements based on your analysis of
customer profiles, past modifications, and the state of technology
in the application domain?

Quality perspective of interest. With the product defined, you now
define the perspectives for the qualities that you are trying to
achieve.

You should make a model of the system's evolution, along with its
associated costs. Based on the data from the evolution of this
system and other systems, as well as on the characteristics of the
set of new requirements, the model should let you estimate the cost
and benefits associated with each of the three process models and
let you choose the appropriate one. Parameters for the model will
include such items as the projected system lifetime, the number of
future related systems, and the projected cost of changes for
various classes of requirements.

Feedback. With the quality perspectives defined, you can now get the
information needed to improve the product or process. The feedback
should provide you with deeper insights into the model and your
environment.

Sample questions include: Is the model appropriate? How can the
model be improved? How can the classifications be improved?

Other goals. There are many relevant goals. Consider the following
examples:

• Evaluate the modification activities in the reuse model to improve
them. Examine the cost and correctness of the resulting objects from
the customer's point of view.

• Evaluate the components of the existing system to determine
whether to reuse them. Examine their independence and functional
appropriateness from the viewpoint of reuse in future systems.

• Predict the ability of a set of code components to be integrated
into the current system from the developer's point of view.

• Encourage the reuse of a set of repository components built for
reuse. Examine the reward structure from the manager's and
developer's points of view.
characteristics.) Software-engineering environments provide such
things as a project database and support the interaction of people
with methods, tools, and project data. However, experience is not
controlled by the project database nor owned by the organization, so
reuse exists only implicitly.

For effective reuse, you need to be able to incorporate the reuse
process model in the context of development. You need to combine the
development and maintenance models to maximize the context
dimensions. You need to integrate characterization, evaluation,
prediction, and motivation into the process. You need to support
learning and feedback to make reuse viable. I propose that the reuse
model can exist in the context of the improvement paradigm, making
it possible to support all these requirements.

Automated support. The improvement paradigm and the reuse-oriented
process model require automated support for the database, encoded
experience, and the repository of previous projects and reusable
components. A special issue of IEEE Software offered a set of
automated and automatable technologies for reuse. You need to
automate as much of the measurement process as possible and to
provide a tool environment for managers and engineers to develop
project-specific goals and generate operational definitions based on
these goals that specify the metrics needed for evaluation. This
evaluation and feedback cannot be done in real time without
automated support.

Furthermore, automated support will help in the postmortem analysis.
For example, a system like Tailoring a Measurement Environment
(TAME), whose goal is to instantiate and integrate the improvement
and goal/question/metric paradigms and help tailor the development
process, can help support the reuse-oriented process model because
it contains mechanisms to support systematic learning and reuse.

Applying the TAME concept to maintenance provides a mechanism for
choosing the appropriate maintenance process model for a particular
project and provides data to help you learn how to do a better job
of maintenance.

January 1990 Copyright 


The approach you take to maintenance depends on the nature of the
problem and the size and complexity of the modification. Viewing
maintenance as a reuse-oriented process in the context of the
improvement paradigm gives you a choice of maintenance models and a
measurement framework. You can evaluate the strengths and weaknesses
of the different maintenance approaches, learn how
The technical paper included in this section was originally 
prepared as indicated below. 

• "Evolution Towards Specifications Environment: 
Experiences With Syntax Editors," M. Zelkowitz, 
Information and Software Technology, April 1990 
Evolution towards specifications 
environment: experiences with syntax editors 

M V Zelkowitz 


Language-based editors have been thoroughly studied over the last 10
years and have been found to be less effective than originally
thought. The paper reviews some relevant aspects of such editors,
describes experiences with one such editor (Support), and then
describes two current projects that extend the syntax-editing
paradigm to the specifications and design phases of the software
life-cycle.

software design, environments, specification, syntax editors 


SYNTAX EDITORS 

Syntax-editing (or alternatively language-based editing) is a
technique that had its beginning about 20 years ago (e.g., Emily)
and blossomed into a major research activity 10 years later (e.g.,
Mentor, CPS). During the mid-1980s, major conferences were often
dominated by syntax-editing techniques. Many of these projects,
however, have since been terminated or have taken a much lower
profile. There are few widely used commercial products that use this
technology. Why?

This paper briefly introduces the concept of syntax 
editing, describes one particular editor, and explains 
some experiences in using it. It is then shown how the 
syntax-editing paradigm is powerful but perhaps misap- 
plied in the domain of source-program generation. 

Just using a syntax editor for source-code production 
does not result in significantly higher productivity. By 
integrating specification generation with this source-code 
production, however, the author believes that increased 
productivity can be provided by making more of the life- 
cycle visible to the programmer. Two extensions to the 
current environment are described that apply syntax 
editing within a specifications environment to provide 
additional functionality over that of standard syntax 
editors. 

With a conventional editor, the user may insert an arbitrary string
of characters at any point in a file, and a later compilation phase
will determine if there are any errors. With a syntax editor,
however, only those choices permitted by the language grammar can be
inserted, and the generation of source program and the processing of
the program's syntax are intertwined operations. For example, for
the statement nonterminal <stmt>, there is only a limited set of
statement types that are permitted, and only those legal strings can
be entered by the user in response to that nonterminal on the
screen.

The user interface is a major component of syntax 
editors. Depending on editor design, syntactic constructs 
can be specified via a mouse and pull-down menus, func- 
tion keys on the keyboard, or special editing prompt 
commands. If the cursor is pointing to the <stmt> 
syntactic unit and the user specifies the if statement, then 
the text 

    if <expr> then
        <stmt>
    else <stmt>

will replace <stmt> on the screen. Each nonterminal 
<...> is considered as a single editing character and 
syntactic constructs must be added or deleted in their 
entirety. In essence, the programmer is building the 
source-program parse tree in a top-down manner. 
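
The substitution step just described can be sketched in a few lines
of Python. This is an illustration only, not code from any of the
editors discussed: the program is held as a flat token list rather
than a real parse tree, and the template table, the construct names
("if", "assign"), and the function names are hypothetical.

```python
# Hypothetical templates: (nonterminal, construct) -> replacement tokens.
# Tokens written in angle brackets are nonterminals still to be filled.
TEMPLATES = {
    ("<stmt>", "if"): ["if", "<expr>", "then", "<stmt>", "else", "<stmt>"],
    ("<stmt>", "assign"): ["<var>", ":=", "<expr>"],
}


def is_nonterminal(token):
    return token.startswith("<") and token.endswith(">")


def expand(program, position, construct):
    """Replace the nonterminal at `position` with the chosen construct.

    Raises ValueError when the construct is not permitted for that
    nonterminal, so only grammar-legal choices can ever be inserted.
    """
    target = program[position]
    if not is_nonterminal(target):
        raise ValueError(f"{target!r} is not a nonterminal")
    template = TEMPLATES.get((target, construct))
    if template is None:
        raise ValueError(f"{construct!r} is not permitted for {target}")
    return program[:position] + template + program[position + 1:]


program = ["<stmt>"]
program = expand(program, 0, "if")
print(" ".join(program))  # if <expr> then <stmt> else <stmt>
program = expand(program, 3, "assign")
print(" ".join(program))  # if <expr> then <var> := <expr> else <stmt>
```

Each call replaces exactly one nonterminal with a whole construct,
which is the sense in which the program is built top-down.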

Pure syntax-editing is a simple macro-like substitution, and such
macro substitutions exist in several conventional editors. For
example, Emacs and Digital's LBE (Language Based Editor) both permit
such substitutions anywhere in a program. Here, however, editors
that go beyond simple substitution are being considered. Screen
layout is often specified (e.g., unparsing the program tree to a
'pretty-printed' display), semantic information is usually checked
(e.g., variable declarations, mixed types), and often the editor is
part of an integrated package or environment of editor, interpreter,
and debugging and testing tools.

Early on, many advantages of a syntax editor were 
stated: 

• Source-program generation would be efficient as a single mouse or
function key click would generate an entire construct.

• Productivity would increase as numerous errors such as missing
begin-end pairs could not occur and mixed mode expressions would
immediately be found by the editor at the point of insertion. Users
could more easily use an unfamiliar language.

• Screen layout would be predefined, providing a uniform structure
to all programs.

• The integrated package of tools enables testing and debugging to
proceed more rapidly.

As shall be seen, the last of these reasons does indeed seem to be
true; each of the others, however, seems to have a serious drawback
as well as the supposed benefit.

As an example, the Support environment, designed by the author, is
briefly described as an instance of the integrated syntax-editing
genre. It has many of the features implemented in such tools and is
the basis for the extensions to specifications described later.

Support design 

Support is an integrated environment built to process the CF-PASCAL
subset of PASCAL and was used for three years (until the course
contents changed) as the programming tool in the introductory
programming course at the University of Maryland. It runs on both
Berkeley Unix and IBM PC systems.

Design 

Major features of Support include the following. 

Text input. Support uses both the command and function key
mechanisms for input. If the cursor (represented by reverse video)
covers the <stmt> unit, a menu at the bottom of the screen gives the
available choices. For example, to insert an if statement, either a
response of .2 or depressing function key 2 (on the PC keyboard)
will insert the if construct.

Support also permits textual substitution for any syn- 
tactic unit. A user can type in an arbitrary line of char- 
acters, and an internal LALR parser builds the subtree 
for that construct. If the root of that subtree is permitted 
by the current cursor position, then it is attached to the 
program tree at that cursor position. 

Using either input mechanism, invalid syntax can never be entered.
Using the menu for input permits only correct responses, and, for
textual input, if the parser cannot resolve the typed-in text to a
correct syntactic unit, an error is displayed and the program is not
modified.

Windows. Horizontal windows dividing the CRT screen are the major
interface with the user. Each tool within Support controls its own
window, and from two to four windows will typically be displayed at
any one time.

Tools. Various tools within Support aid in program design and
development. The relationship among procedures in a program is
handled by the Design window; an interpreter executes partially
developed programs and includes features such as variable and
statement tracing and breakdown monitoring. Statement trace and
statement coverage windows are part of this structure. Data are
displayed via the variable trace and the run-time display windows.

As an extension to the textual input mode, a small (i.e., size of
screen) text editor called the Character Oriented EDitor (or COED)
was implemented. Users insert or modify arbitrary sequences of
characters in this window, have the text processed by the LALR
parser mentioned above, and then have the text inserted into the
program tree at the appropriate place in the program. The user can
also pull an arbitrary section of program text into this window for
modification. This also gave an easy cut-and-paste feature and the
ability to move sections of code around in the program as a means to
address some of the syntax-editing deficiencies that turned up.

Table 1. Background of students

                                        Semester 1   Semester 2
First university computer course (%)        73           82
Took this course previously (%)             12            9
Took high-school course (%)                 59           55
Never previously used computer (%)          26           24
Own microcomputer (%)                       49           51

Language and screen displays. The grammar processed by Support
(e.g., CF-PASCAL) is defined via an external data file that defines
the syntax, some semantics, and screen layout. This feature turned
out to be a major factor in allowing Support to be extended for
other applications.

Experiences 

Support was used from 1986 until 1989 in Computer 
Science I by approximately 200 to 300 students each 
semester. During the first two semesters data were 
collected from the 543 students that enrolled in the 
course. The background of the students is summarized in 
Table 1. As shown, about 75% had previous experience 
with programming and about half own their own 
computer. 

Based on a 1 to 5 rating scale (1 = poor), students who owned their
own computer (and presumably had more experience in programming)
rated satisfaction with Support lower than those without their own
computer (2.8 versus 3.2). More revealing, students rated Support's
text-editing capabilities much lower than those of an IBM mainframe
also used during the semester (2.7 versus 3.7 for one semester, 3.3
versus 3.8 for the other). The author believes that users with
experience with general text editors felt more restricted by the
syntax-editing paradigm. On the other hand, novices with no previous
experience felt aided by such restrictions.

Students using Support rated its debugging capabilities higher than
those available on the IBM mainframe (3.8 versus 3.1 for one
semester, 3.0 versus 2.9 for the other). The PC system was also
rated as more available compared with the mainframe (3.9 versus 2.8
for one semester, 3.0 versus 2.9 for the other). Other results are
presented elsewhere.

In summary, syntax editing seems to be viewed as a 
restriction on program development, but the integrated 
development and testing environment appears to be 
desired. A tool that simply develops source text does not 
seem to produce a large productivity increase. The 
results here are comparable to those found with other 
editing environments. 
Retrospective 

After several years of use and several redesigns and enhancements
based on user needs and experiences, the four advantages claimed for
such editors can be addressed more clearly. As shall be seen, for
most of the advantages, there are some serious problems to overcome.

Efficient generation of source programs 

For entering much of the text of a program, this is true, but
unfortunately there are enough complications to slow down
experienced programmers. For example, the PASCAL if statement has an
optional else clause. Should the editor automatically insert the
else and have the programmer delete it if not desired, or should it
not be included with the corresponding need to add it if wanted?
Support chose the latter model, but in either case the editor will
be wrong about half of the time.

In Support's case, the screen displays no information about optional
syntactic units, so the user needs to know where such units are
located. There are two modes of moving forward through a program:
one key moves to the next syntactic unit displayed on the screen,
while the enter key is similar but will insert any optional phrases
between displayed syntactic units as it moves. In POE's case, the
opposite occurs. All optional units are displayed initially, and the
user must delete them if not specifically wanted.

A more serious consequence is that syntactic units are added
top-down, but programmers usually think of algorithms as sequential
actions. For adding new statements, there is not much difference
between sequential insertion and top-down development of the BNF:

    <stmt list> ::= <stmt> ; <stmt list> | <stmt>

as both generate statements in a left-to-right manner. Insertion of
expressions such as A + B*C, however, essentially means to build the
tree in prefix order (e.g., "+", "A", "*", "B", "C"), which is not
the natural sequence.
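
The mismatch between the two entry orders can be shown with a short
Python sketch. The `Node` class and the two traversal functions are
hypothetical helpers written for this illustration; they are not
part of Support or of any editor mentioned here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def top_down_order(node):
    """Order in which a top-down editor creates nodes: root first."""
    if node is None:
        return []
    return ([node.value]
            + top_down_order(node.left)
            + top_down_order(node.right))


def typing_order(node):
    """Order in which a programmer would type the expression (infix)."""
    if node is None:
        return []
    return (typing_order(node.left)
            + [node.value]
            + typing_order(node.right))


# Parse tree for A + B*C: "*" binds tighter, so it sits under "+".
tree = Node("+", Node("A"), Node("*", Node("B"), Node("C")))

print(top_down_order(tree))  # ['+', 'A', '*', 'B', 'C']
print(typing_order(tree))    # ['A', '+', 'B', '*', 'C']
```

In the top-down order the user must commit to the "+" operator
before entering "A", which is the source of the friction described
in the text.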

In some environments, such as CMU's Gandalf, this top-down linking
to the program's parse tree is embedded in the user interface; in
Support's case, however, the LALR parser mentioned earlier was
added. Straight text will be parsed and entered in its true infix
format. The COED editor within Support was a valuable extension that
permitted programmers to add small sections of program text (up to
22 lines of input) without violating the basic top-down nature of
program generation in a syntax editor.

Early detection of syntax and semantic errors 

While true, this is not much of a benefit if its consequences are
considered. Experienced programmers generally do not make many
syntax errors as they enter text, although novices do. (This might
explain Support's greater popularity among non-programmers than
among programmers.)


There are cases where this supposed benefit is actually a hindrance.
If an experienced programmer thinks of a sequence of code to enter
and makes an error in input, a standard editor will ignore the error
and continue entering data. After finishing entering code, the
programmer can fix the earlier problem. With a syntax editor,
however, only correct syntax can be entered. The system will usually
halt and beep until corrective action is taken. Thus there is a
disruption in a train of thought where some deep semantic issue
needs to be put aside (and forgotten?) to fix some simple syntax.

Looking at both of these reasons, as languages get more complex
(e.g., ADA) syntax editing might make more sense, but in relatively
simple languages, like PASCAL and C, there seem to be few benefits.
There is little experience with such editors for complex languages.
Arcturus is a prototype of an ADA editor, but it was not made
commercially available.

Screen layout is predefined 

This is also true, but again the predefined layout might 
not be what the programmer wants in all cases. It cer- 
tainly helps the novice generate nicely indented listings, 
but as the programming task grows more complex, the 
number of special cases increases. 

The placement of comments seems to pose a problem with all such
editors. Comments are generally outside the language's defining BNF.
Where do they appear in the listing? In Support they are tagged
before the defining nonterminal. This works in some cases, but not
all.

Uniform debugging and testing tools 

This again is true, but a syntax editor is not needed for 
this feature. An integrated framework and data reposit- 
ory are needed for a source program. The current interest 
in CASE (computer-aided software engineering) tools 
exemplifies this, and Support is simply a CASE tool with 
a syntax editor for a base. 

In summary, the experiences with Support are by no means unique and
closely mimic experiences others have had with syntax editors. For
example, Mentor, initially developed about eight years earlier at
INRIA, has had a similar pattern of development and use. Similar to
experiences with Support:

• Novices used menus but experienced programmers rarely did.

• Experienced programmers wanted the full-screen Emacs editor for
textual input and modification (providing functionality similar to
the COED editor described here) using automatic parsing and
unparsing of the Mentor input.

• Switching between Mentor and Emacs was difficult due to the
inherent problems in placement of comments.

On the other hand, Mentor was a powerful source-code maintenance
system due to the integration of many program analysis tools for
obtaining semantic information about a program. But just as in
Support's case, such tools are mostly a function of Mentor being an
integrated environment and not simply an editor.

In conclusion, the drawbacks of syntax editing seem to be as serious
as the advantages, which probably explains the lack of growth and
popularity of such editors since the early '80s. As a final comment,
source-code development is often stated as 15% of total life-cycle
costs. Even if the editor reduced coding time to zero, that would
still mean a productivity improvement of only 15%. Industry is
looking for more than that.
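
The bound in this argument is simple arithmetic, sketched below in
Python. Only the 15% coding share comes from the text; the function
name and the partial-reduction values (25% and 50%) are hypothetical
and chosen purely to show how the saving scales.

```python
def remaining_cost(phase_share, phase_reduction):
    """Fraction of total life-cycle cost left after improving one phase.

    phase_share     -- share of total cost spent in the phase (0..1)
    phase_reduction -- fraction of that phase's cost eliminated (0..1)
    """
    return 1.0 - phase_share * phase_reduction


coding_share = 0.15  # figure quoted in the text

# Hypothetical levels of editor-driven improvement in coding effort.
for reduction in (0.25, 0.5, 1.0):
    saved = 1.0 - remaining_cost(coding_share, reduction)
    print(f"coding cost cut by {reduction:.0%}: "
          f"total cost falls by {saved:.2%}")
```

Even the limiting case (coding cost eliminated entirely) leaves 85%
of the total cost untouched, which is why the paper turns to the
earlier life-cycle phases next.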

SPECIFICATIONS 

The previous discussion indicates that while syntax editing of
source programs is a powerful technique, it probably has minimal
effect on programmer productivity. As requirements, specification
and coding take up to 75% of the costs to develop a system, however,
improving those phases of the life-cycle might have a more dramatic
impact on productivity. In addition, a mechanism to improve the flow
from specifications to design to code would probably lead to fewer
interface errors, hence decreasing the effort needed in testing and
further improving productivity.

For coding source programs, there are several pro- 
gramming techniques: procedural languages (e.g., PAS- 
CAL, C, ADA, COBOL), applicative languages (e.g., LISP, 
PROLOG), object-oriented programming (e.g., SMALL- 
TALK, C++), etc. Their relative strengths and weak- 
nesses for specific applications are fairly well established. 
For specification of a program, there are also several 
models (e.g., axiomatic, denotational, algebraic, func- 
tional); however, as yet there is no clear consensus as to 
which is most effective and how each applies to different 
application domains. This is still very much an open 
research question, with many ongoing projects studying 
various specification strategies. 

Given the powerful syntax-editing paradigm and its relative
inability to improve source-code generation, the author decided to
investigate it within a specification domain. After all, most
specification languages have a syntax and semantics more complex
than most programming languages, and some anecdotal data do seem to
indicate that programmers would prefer syntax editors for
sufficiently complex languages.

As stated previously, Support processes a language defined by an
external grammar file, and it is constructed as a set of independent
tools, each writing to virtual windows that are mapped to the actual
computer screen. By modifying this grammar and by adding new support
tools, Support becomes an interface 'shell' for a series of
integrated environments. It can be used as a language processing
meta-environment by providing the capabilities to read input, parse
text, build parse trees, and manipulate multiple windows
simultaneously. Using Support, two such extensions were developed
that are described here: AS* (based on algebraic specifications) and
FSQ (based on functional specifications).


(1)  sort sequence [sort something] is
(2)    constructor
(3)      epsilon;
(4)      cons : something, sequence;
(5)    operation head : sequence -> something is axiom
(6)      head(epsilon) => ?;
(7)      head(cons(X,Y)) => X;
(8)    operation count : sequence -> integer is axiom
(9)      count(epsilon) => 0;
(10)     count(cons(X,Y)) => 1 + count(Y);
(11) end;

Figure 1. Example of sequence specification 

AS* for executable specifications 

An algebraic specification is a series of axioms that link 
together the operations that can be applied to an abstract 
data type. As an extension to the Support environment, a 
specifications extension based on these algebraic axioms 
has been defined. 

An AS* specification contains three features: 

• a set of sort names that define new abstract objects and 
their constructors 

• a signature, which defines a set of defined operations 
for manipulating the abstract objects 

• a set of oriented equations (or axioms) that relate the 
defined operations and constructors to each other 

Figure 1 gives a simple example of a specification for a sequence.
Line (1) specifies that a class of objects of sort (i.e., type)
'sequence' is being defined and indicates that the new object will
require as a parameter a sort 'something' that will be specified in
a later binding. A generic class of sequences that will be
instantiated by this later binding to 'something' is being defined.
Lines (2)-(4) define the two constructors needed to create an object
of this sort: 'epsilon' to return the empty object of sort
'sequence' and 'cons', which takes an element and a sequence and
returns a new sequence with the element in it. The functionality of
each constructor is given after its name with the sort name
'sequence' implied as last (e.g., 'epsilon' returns an empty
'sequence' and 'cons' requires a 'something' and a 'sequence' and
returns a 'sequence'). 'Epsilon' initializes objects of this sort
and 'cons' creates new complex objects.

This object is manipulated by means of a set of defined operations.
In this simple example, operations 'head' and 'count' are given with
their signatures on lines (5) and (8). They are defined by the
rewrite rules (axioms) on lines (6)-(10). 'Head' says to return the
element last included into the sequence by the 'cons' function,
while 'count' returns 0 for 'epsilon' (i.e., an empty list) or 1
plus the size of any non-null list with the first element removed.
As can be seen, the formal definition of each function includes
recursive algorithms for computing its value by reducing any complex
object to a finite set of applications of the constructor functions.
The '?' on line (6) is equivalent to an error condition, and the
implementation stops execution and issues an error message when this
occurs. (That is, it is illegal to take the 'head' of an empty
list.)

For example, the list <X,Y,Z> is created by the 
construction: 

cons(X,cons(Y,cons(Z,epsilon)))




and the operation ‘count’ uses this construction, as in:

count(<X,Y,Z>) =
1 + count(<Y,Z>) =
1 + 1 + count(<Z>) =
1 + 1 + 1 + count(<epsilon>) =
1 + 1 + 1 + 0 =
3
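
As an illustration only, the ‘sequence’ specification can be mimicked in a general-purpose language. The sketch below is Python, not ASPascal; the class names Epsilon and Cons and the choice of ValueError for the ‘?’ error condition are assumptions made for this example. Each branch corresponds to one axiom of the specification.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass(frozen=True)
class Epsilon:
    """Constructor 'epsilon': the empty sequence."""


@dataclass(frozen=True)
class Cons:
    """Constructor 'cons': an element placed in front of a sequence."""
    first: Any
    rest: "Sequence"


Sequence = Union[Epsilon, Cons]


def head(s: Sequence) -> Any:
    # Axiom: head(epsilon) == ?   (error condition, assumed to raise here)
    if isinstance(s, Epsilon):
        raise ValueError("head of an empty sequence is undefined")
    # Axiom: head(cons(x, y)) == x
    return s.first


def count(s: Sequence) -> int:
    # Axiom: count(epsilon) == 0
    if isinstance(s, Epsilon):
        return 0
    # Axiom: count(cons(x, y)) == 1 + count(y)
    return 1 + count(s.rest)


# The list <X,Y,Z> built exactly as in the construction above.
xyz = Cons("X", Cons("Y", Cons("Z", Epsilon())))
print(head(xyz), count(xyz))
```

The recursion in count unwinds in the same way as the hand derivation: one unit is added per ‘cons’ until ‘epsilon’ is reached.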

The use of the Knuth-Bendix algorithm [12] defines a
proof of adequacy of the resulting algebraic equations by
showing the equivalence of supposedly equal terms to the
same ground (i.e., constant) terms. As the Knuth-Bendix
algorithm is based on an ordering transformation
from one term to a ‘simpler’ term, however, the algorithm
also defines an operation that can be ‘executed’ and
proven to terminate. Therefore, any set of axioms that is
‘Knuth-Bendix’ can be transformed mechanically into
a series of transformations that can be executed in some
programming language, in this case PASCAL.

Similar to Larch and Larch/CLU [13], AS* specifications
are independent of the underlying programming language
and must be defined relative to any concrete language.
Libraries of generic specifications can be used to
form the basis of a reuse methodology where the generic
specification is refined to an explicit specification in a
specific programming language by binding the generic
sorts to specific programming language types. In this
case PASCAL is considered as the implementation vehicle,
so to create ASPascal, the extension to PASCAL that
contains AS* specifications, a link between a PASCAL
object and an AS* sort must be indicated.

An explicit specification is created by a refinement of a
generic specification via the use clause, as in:

sort intsequence is

use sequence [integer]

end;

which refines the generic sort ‘sequence’ given earlier and
indicates that a new sort ‘intsequence’ is created by
modifying ‘sequence’ with a binding of PASCAL integers
to the free sort ‘something’ of Figure 1. The operations
‘head’ and ‘count’ in ‘sequence’ become ‘intsequence_head’
and ‘intsequence_count’ in the new sort, although
the actual mapping to their new names is handled
automatically and is of no concern to the programmer.

The interface assumption is made that an explicit sort 
specification 

sort newsort is . . . 

is equivalent to the PASCAL type declaration

type newsort = . . .

Figure 2. AS* toolset

The primitive PASCAL scalar types (char, Boolean,
integer, real) may all be used in abstract sort definitions,
and any explicit sort may also be used in a refinement.
Thus 

var A: intsequence; 

simply creates a PASCAL variable A, which is of type 
‘intsequence’. 

The power of this system is in alternative bindings. For 
example, real sequences could be created as 

sort realsequence is use sequence [real] end;

Similarly, a sort such as a ‘book’ could be used to create
a type ‘library’ as

sort library is use sequence [book] end;

As stated earlier, syntax editors might have greater use 
with more complex source languages, and the integrated 
tool set forms an effective basis for a CASE tool. There- 
fore, a prototype AS* system was built on top of the 
existing Support environment. Figure 2 represents this 
initial system that has been constructed. The four com- 
ponents are as follows. 

AS/Support 

AS/Support is a modification to the Support environment
described earlier, which provides text-editing capabilities
for creating specifications. It is also the control
module that invokes the verification tool. AS/Support
first checks axioms within operations for syntactic
consistency. Because of the language-based design of the
underlying environment, only syntactically correct
axioms with the syntax

operation_name(<expression-list>) == <expression>

can be entered by the user. After the user builds a sort,
AS/Support converts the sort into a format suitable for
PROLOG and invokes AS/Verifier as a subprocess.
AS/Verifier reads these axioms and checks executability.
Once the specification has passed all executability checks
in AS/Verifier, the user may save the ASPascal
program in a library for later translation by AS/PC or
for later incorporation into another ASPascal program.
In case of failure, the causing axiom, if it can be determined,
is highlighted to allow the user an interactive
mechanism to change the specifications.

AS/Verifier 

AS/Verifier, a PROLOG program, is called by AS/Support
and verifies the set of axioms via the Knuth-Bendix
algorithm. In general the axioms need to be a noetherian
term rewriting system, and, if possible, AS/Verifier
makes this determination. Of course, as the general
problem is undecidable, in some cases the results are
inconclusive. In any case, after one pass through the
axioms, AS/Verifier will either succeed or indicate which
axiom is currently failing so that the user may modify the
definition and try again. As stated previously, if any
error is found, an appropriate message is relayed back to
AS/Support and displayed to the user.

For example, the ‘sequence’ definition of Figure 1 will
be converted to the following clauses and passed to
AS/Verifier:

as_sort(sequence, [epsilon, cons, head, count]).

function(1, epsilon, [], sequence).

function(2, cons, [something, sequence], sequence).

function(3, head, [sequence], something).

function(4, count, [sequence], integer).

axiom(5, head(epsilon), "?").

axiom(6, head(cons(x,y)), x).

axiom(7, count(epsilon), 0).

axiom(8, count(cons(x,y)), 1+count(y)).

(‘as_sort’ is the internal name for a ‘new sort’.) The
Knuth-Bendix algorithm either shows convergence of
the axioms or indicates additional axioms that are
needed; it may not indicate, however, when sufficient
axioms have been added in the case of not converging
rapidly enough (the usual problem with undecidability
results). In this case, AS/Verifier does a single pass over
the axioms and then terminates, indicating where the
problem is with the axioms.

AS/PC 

AS/PC is the translator, written in YACC, that converts
specifications into standard PASCAL source programs.
The code generally consists of a sequence of if statements,
each checking the validity of the left-hand side of
the axiom before executing the Knuth-Bendix reduction.
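
The shape of such generated code can be suggested by a small sketch. The following is Python rather than the PASCAL that AS/PC emits, and it is not the translator's actual output: terms are represented as nested tuples, and the helper names matches and reduce_term are assumptions made for the example. Each if tests the left-hand side of one ‘sequence’ axiom and, when it applies, returns the right-hand side.

```python
def matches(term, op, inner):
    """True if term has the shape op(inner(...)), e.g. count(cons(x, y))."""
    return (isinstance(term, tuple) and len(term) == 2 and term[0] == op
            and isinstance(term[1], tuple) and term[1][0] == inner)


def reduce_term(term):
    """Rewrite a term to normal form with the four 'sequence' axioms."""
    if not isinstance(term, tuple):
        return term                      # a constant such as "X" or 0
    # Reduce the arguments first (innermost-first strategy).
    term = (term[0],) + tuple(reduce_term(arg) for arg in term[1:])

    if matches(term, "head", "epsilon"):
        # head(epsilon) == ?  : the error condition
        raise ValueError("head(epsilon) is an error")
    if matches(term, "head", "cons"):
        return term[1][1]                # head(cons(x, y)) == x
    if matches(term, "count", "epsilon"):
        return 0                         # count(epsilon) == 0
    if matches(term, "count", "cons"):
        # count(cons(x, y)) == 1 + count(y)
        return 1 + reduce_term(("count", term[1][2]))
    return term                          # constructor term: already normal


EPSILON = ("epsilon",)
xyz = ("cons", "X", ("cons", "Y", ("cons", "Z", EPSILON)))
print(reduce_term(("count", xyz)))
print(reduce_term(("head", xyz)))
```

Because every rule replaces a term by a simpler one, the reduction stops after finitely many steps, which is the property the verifier is meant to establish before translation.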

PC 

PC is the standard system PASCAL compiler. At this 
point, the specifications have been converted to standard 
PASCAL, and any comparable compiler can be used for 
compilation and execution. 

Specifications appear in programs as function calls in the
host programming language. To translate such calls, it is
necessary to determine, for each function reference,
which explicit specification is being used. Thus a reference
to ‘head(thing)’ where ‘thing’ is an ‘intsequence’ is
translated to a call to ‘intsequence_head(thing)’, while
‘head(realthing)’ will result in ‘realsequence_head(realthing)’
for variable ‘realthing’ of sort ‘realsequence’.
(The details of the AS* implementation appear
elsewhere [14].)
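
The resolution step can be pictured as a lookup from a variable to its declared sort. The sketch below is Python and purely illustrative; the table declared_sorts and the function resolve_call are hypothetical names and do not describe how AS/PC is implemented.

```python
# Hypothetical symbol table: variable name -> explicit sort it was declared with.
declared_sorts = {"thing": "intsequence", "realthing": "realsequence"}


def resolve_call(operation: str, argument: str) -> str:
    """Return the sort-qualified call that replaces operation(argument)."""
    sort = declared_sorts[argument]
    return f"{sort}_{operation}({argument})"


print(resolve_call("head", "thing"))
print(resolve_call("head", "realthing"))
```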

It should be clear that this translation does not result 
in a particularly efficient implementation; as a specifica- 
tions or prototyping tool, however, efficiency is not its 
overriding purpose. The goal is to provide easily a cor- 
rect extension to an existing system and to provide a 
verification tool, e.g., an oracle, that can be used as a test 
against an eventual efficient solution to the problem. 

FSQ for software reuse 

In the previous section, AS* was described as an environment
based on an algebraic specification model for program
specifications. Support is also being applied using
the functional correctness model [15]. In this model, both a
program and a specification are viewed as functions, and
techniques have been developed to determine if both
represent the same transformation of the data. This
model of program development is briefly summarized,
and how Support is modified to aid in this process is then
demonstrated.

Functional correctness 

A specification f is a function. A box notation [...] is used
to signify the function that a given string of text
implements. If character string α represents a source
program that implements exactly f, then $[\alpha] = f$, and it is
stated that α is a solution to f.

Sequential program execution is modelled by function
composition. If a sequence of statements $s = s_1; s_2; \ldots; s_n$,
then $[s] = [s_1] \circ [s_2] \circ \cdots \circ [s_n] = [s_n](\ldots([s_2]([s_1]))\ldots)$. Using
techniques from denotational semantics, each statement
s is a function from a program state to another state.
Each program state is a function from variables to values
and represents the abstract notion of data storage.
Symbolic trace tables are used to derive the state functions
for if, while, and assignment statements.
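
A minimal sketch of this view, written in Python for illustration: a state is a dictionary from variable names to values, and each statement is a function from state to state. The helper names assign, if_then_else and sequence are assumptions of the example, and the three-statement program is a small fragment chosen for the demonstration.

```python
from functools import reduce
from typing import Callable, Dict

State = Dict[str, object]
Statement = Callable[[State], State]


def assign(var: str, expr: Callable[[State], object]) -> Statement:
    """[var := expr]: a new state that differs from the old one only at var."""
    return lambda state: {**state, var: expr(state)}


def if_then_else(cond, then_stmt: Statement, else_stmt: Statement) -> Statement:
    """[if cond then s1 else s2]: choose a branch from the incoming state."""
    return lambda state: then_stmt(state) if cond(state) else else_stmt(state)


def sequence(*stmts: Statement) -> Statement:
    """[s1; ...; sn]: apply [s1] first and [sn] last."""
    return lambda state: reduce(lambda st, s: s(st), stmts, state)


# a := 'x'; b := 'y'; if a < b then c := a else c := b
program = sequence(
    assign("a", lambda st: "x"),
    assign("b", lambda st: "y"),
    if_then_else(lambda st: st["a"] < st["b"],
                 assign("c", lambda st: st["a"]),
                 assign("c", lambda st: st["b"])),
)
print(program({}))
```

Here sequence plays the role of the composition operator: the meaning of the whole program is obtained only from the meanings of its parts.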

Program design is accomplished by converting a specification
function f, written in a LISP-like notation, into a
source program α, and then showing that $[\alpha] = f$. The
specification f is called the abstract function and the
program α the concrete design. Given this functional
model, the basic theorem for functional correctness [16] can
then be proved. Program p is correct with respect to
specification function f if and only if $f \subseteq [p]$.

This model can be applied to three separate activities: 

• Program verification. If f is a function and if p is a
program, determine if they are the same function, i.e.,
$[p] = f$ or more generally $f \subseteq [p]$.

• Program design. If f is a function, then develop a
program p such that $[p] = f$.

• Reverse engineering. If p is a program, then find a
function f such that $[p] = f$.

Figure 3. FSQ derived meaning for program fragment
(meaning window: ('x' < 'y') -> a := 'x'; b := 'y'; c := 'x';
not ('x' < 'y') -> a := 'x'; b := 'y'; c := 'y'.
Program trace window: T -> a := 'x'; T -> b := 'y';
a < b -> c := a; not (a < b) -> c := b.)




FSQ extensions 

The use of existing program fragments when developing
a new program is one technique being studied for
improving programmer productivity. Often, however, it
is first necessary to determine exactly what these program
fragments or procedures do. As formal specifications
are rarely used, and documentation is generally
quite inadequate, programmers are reluctant to use an
existing procedure written by another programmer for
some previous project, since the mental effort to truly
understand that procedure is quite high.

To study this problem, the Support environment was
extended with a new tool, Function Specification Qualifier
(FSQ), to aid this process of determining the specifications
for an existing component of a system. FSQ-1, a
first prototype of this tool, is described.

FSQ is an additional tool to the basic CF-PASCAL
programming environment in Support and works as
follows:

• A programmer either builds a program using Support
(and hence uses FSQ as a verification tool) or else
reads one from the file system using the LALR parser
internal to Support to build the parse tree (making
FSQ a reverse engineering tool).

• The cursor is moved over the section of program that
needs to be verified and FSQ is invoked via the command
fsq.

• FSQ symbolically executes each statement and determines
its meaning. This is relayed back to the user,
who either accepts this meaning (e.g., its specification)
or manually simplifies it to another meaning.

• The derived meaning is stored in the Support syntax 
tree. If any part of a program is symbolically executed 
and already has a derived meaning, then that meaning 
will be used without further analysis. This meaning 
can then be carried along as part of the file system 
repository information on that object. Future users of 
that object will not have to derive the meaning again. 


Over time, more and more procedures in the system 
repository will have such derived meanings, making it 
more efficient to reuse such objects frequently. 

Figure 3 shows a sample execution of FSQ. The top 
meaning window shows the desired result from the 
execution, the middle program trace window indicates 
each partial result, and the bottom window highlights the 
section of the source program that is under study. 

FSQ executes over the covered portion of Figure 3 as 
follows: 

• (1) For a := 'x' the system derives the conditional
T -> a := 'x'. (This is similar to the LISP ‘cond’ and means
‘True implies a := 'x'.’)

• (2) For b := 'y' the system derives the conditional
T -> b := 'y'.

• (3) For c := a the system derives the conditional
T -> c := a.

• (4) For c := b the system derives the conditional
T -> c := b.

• (5) For the if statement, FSQ combines steps (3) and
(4) to produce:

not (a<b) -> c := b;

(a<b) -> c := a

• (6) Finally, for the entire sequence, FSQ combines the
results from steps (1) through (5) to produce the function
described in Figure 3.
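
The six steps can be sketched as a small symbolic evaluator. The code below is Python and is only an assumption about how such a derivation might be organized, not a description of FSQ itself: a meaning is a list of (guard, assignments) pairs, expressions are nested tuples, and the names lit, subst, conj, assign, if_else and seq are invented for the example.

```python
TRUE = ("T",)


def lit(value):
    """A literal such as 'x'."""
    return ("lit", value)


def subst(expr, env):
    """Replace program variables in expr by the expressions bound in env."""
    if isinstance(expr, str):                 # a program variable
        return env.get(expr, expr)
    if expr[0] in ("lit", "T"):
        return expr
    return (expr[0],) + tuple(subst(e, env) for e in expr[1:])


def conj(g1, g2):
    """Conjunction of two guards, dropping a trivially true one."""
    if g1 == TRUE:
        return g2
    if g2 == TRUE:
        return g1
    return ("and", g1, g2)


def assign(var, expr):
    """Steps (1)-(4): v := e has the meaning T -> v := e."""
    return [(TRUE, {var: expr})]


def if_else(cond, then_m, else_m):
    """Step (5): guard each branch by the condition or its negation."""
    return ([(conj(cond, g), a) for g, a in then_m] +
            [(conj(("not", cond), g), a) for g, a in else_m])


def seq(m1, m2):
    """Step (6): compose two meanings, substituting earlier values forward."""
    out = []
    for g1, a1 in m1:
        for g2, a2 in m2:
            later = {v: subst(e, a1) for v, e in a2.items()}
            out.append((conj(g1, subst(g2, a1)), {**a1, **later}))
    return out


meaning = seq(seq(assign("a", lit("x")), assign("b", lit("y"))),
              if_else(("<", "a", "b"), assign("c", "a"), assign("c", "b")))
for guard, assignments in meaning:
    print(guard, "->", assignments)
```

The two resulting cases are guarded by 'x' < 'y' and its negation, with c bound to 'x' and 'y' respectively, which is the form of the function shown in the meaning window of Figure 3.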

Note that this process is simpler than general program
verification (and potentially less accurate) as the programmer
can override the system and insert arbitrary
definitions. For example, in the program of Figure 3, the
user, in the process of deriving the meaning of the if
statement at step (5), could have substituted either the
correct simplification

c := min (a,b)

or any other correct or incorrect expression for the if.
Thus the user must trade off between ‘absolute’ but
extremely difficult correctness using a verifier and a
system like FSQ, which performs efficient, but possibly
imperfect, verification. The tool is truly interactive, with
FSQ performing all the tedious bookkeeping procedures
and the user providing the creative program derivation
activities. This avoids the general
undecidability issues of general verifiers and permits the
data-intensive functional verification mechanism to be
used practically.

CONCLUSIONS 

In this paper the basic features of syntax-directed editors 
have been described and possible reasons why such 
editors have not become more popular outlined. The 
author believes that their benefits do not increase pro- 
ductivity sufficiently to compensate for their deficiencies. 
Source-code generation, although labour intensive, is not 
a major cost factor in system development. 

However, syntax editors can provide a consistent 
interface when system specification is integrated with 
source-code generation. To experiment with this, two
specification projects have been described as extensions 
to an existing PASCAL development environment. In 
these extensions both algebraic specifications and func- 
tional correctness models of development were applied 
as extensions of automated tool support. Further work is 
needed to test the eventual applicability of this form of 
environment. 

ACKNOWLEDGEMENTS 

This work was partially supported by Air Force Office of
Scientific Research grant 87-0130, Office of Naval
Research grant N00014-87-K-0307, and NASA grant
NSG-5123, all to the University of Maryland. Individuals
who have contributed include: for Support: Bonnie
Kowalchack, David Itkin, Jennifer Drapkin, Michael
Maggio, and Laurence Herman; for AS*: Sergio Antoy
(of Virginia Tech), Sergio Cardenas, Paola Forcheri and
Maria Teresa Molfino (of I.M.A., Genoa, Italy), Stuart
Pearlman, and Lifu Wu; and for FSQ: Victor Basili and
Sara Qian.

REFERENCES 

1 Hansen, W J ‘User engineering principles for interactive
systems’ in Proc. Fall Joint Comp. Conf. Vol 39
(1971) pp 523-532

2 Donzeau-Gouge, V, Kahn, G, Huet, B, Lang, B and
Levy, J ‘A structure assisted program editor: a first
step towards computer assisted programming’ in
Proc. Int. Computer Symp. North-Holland, Amsterdam,
The Netherlands (1975)

3 Teitelbaum, T and Reps, T ‘CPS: the Cornell Program
Synthesizer’ Commun. ACM Vol 24 No 9 (1981) pp
563-573

4 Proc. ACM SIGPLAN Symp. Language Issues in
Programming Environments Seattle, WA, USA (June
1985)

5 Proc. ACM SIGSOFT Practical Software Development
Environment Conf. Pittsburgh, PA, USA (April
1984)

6 Zelkowitz, M V ‘A small contribution to editing with
a syntax directed editor’ in Proc. ACM SIGSOFT
Practical Software Development Environment Conf.
Pittsburgh, PA, USA (April 1984) pp 1-6

7 Zelkowitz, M V, Kowalchack, B, Itkin, D and Herman,
L ‘A support tool for teaching computer programming’
in Fairley, R and Freeman, P (eds) Issues in
software engineering education Springer-Verlag, Berlin,
FRG (1989) pp 139-167

8 Fischer, C, Pal, A, Stock, D, Johnson, G and Mauney,
J ‘The POE language-based editor project’ in Proc.
ACM SIGSOFT Practical Software Development
Environment Conf. Pittsburgh, PA, USA (April 1984)
pp 21-29

9 Habermann, N and Notkin, D ‘Gandalf: Software
development environments’ IEEE Trans. Soft. Eng.
Vol 12 No 12 (December 1986) pp 1117-1127

10 Standish, T and Taylor, R ‘Arcturus: a prototype
advanced Ada programming environment’ in Proc.
ACM SIGSOFT Practical Software Development
Environment Conf. Pittsburgh, PA, USA (April 1984)
pp 57-64

11 Lang, B ‘On the usefulness of syntax directed editors’
in Proc. IFIP Workshop on Advanced Programming
Environments Trondheim, Norway (June 1986) pp
45-51

12 Knuth, D and Bendix, P ‘Simple word problems in
universal algebras’ in Computational problems in
abstract algebra Pergamon Press, New York, NY,
USA (1970) pp 263-297

13 Wing, J ‘Writing Larch interface specifications’ ACM
Trans. Prog. Lang. Syst. Vol 9 No 1 (1987) pp 1-24

14 Antoy, S, Forcheri, P, Molfino, T and Zelkowitz, M
‘Rapid prototyping of system enhancements’ in Proc.
1st Int. Conf. System Integration (April 1990)

15 Gannon, J D, Hamlet, R G and Mills, H D ‘Theory of
modules’ IEEE Trans. Soft. Eng. Vol 13 No 7 (July
1987) pp 820-829

16 Mills, H D, Basili, V R, Gannon, J D and Hamlet, R G
Principles of computer programming: a mathematical
approach Allyn Bacon (1987)


information and software technology 

















SECTION 5 - ADA TECHNOLOGY STUDIES


The technical papers included in this section were originally 
prepared as indicated below. 

• "On Designing Parametrized Systems Using Ada," 

M. Stark, Proceedings of the Seventh Washington Ada
Symposium, June 1990

• "PUC: A Functional Specification Language for 

Ada," P. Straub and M. Zelkowitz, Proceedings of
the Tenth International Conference of the Chilean
Computer Science Society, July 1990

• "Software Reclamation: Improving Post-Development 

Reusability," J. Bailey and V. Basili, Proceedings
of the Eighth Annual National Conference on Ada
Technology, March 1990
On Designing Parametrized Systems Using Ada
Michael Stark

Goddard Space Flight Center


1. Introduction 

A parametrized system is a software system that can be
configured by selecting generalized models and providing
specific parameter values to fit those models into a
standardized design. This is in contrast to the top-down
development approach where a system is designed first and
software is reused only when it fits into the design. The term
reconfigurable is used interchangeably with parametrized
throughout the paper. This concept is particularly useful in a
development environment such as the Goddard Space Flight
Center (GSFC) Flight Dynamics Division (FDD), where
successive systems have similar characteristics.

The FDD's Software Engineering Laboratory (SEL) has been
examining reuse issues associated with Ada from the beginning
of its Ada research in 1985. The lessons learned have been
applied to operational Ada systems, leading to an immediate
trend towards greater reuse than is typical for FORTRAN
systems [McGarry 1989]. In addition, the Generic Simulator
prototyping project (GENSIM) was a first effort at designing a
parametrized simulator system. The lessons learned through
the use of Ada and the GENSIM prototype are being applied to
the Combined Operational Mission Planning and Attitude
Support System (COMPASS), which is to be a reconfigurable
system for a much larger portion of the flight dynamics domain.
This paper will discuss the lessons learned from the GENSIM
project and some of the reconfiguration concepts planned for
COMPASS, and will define a model for the development of
reconfigurable systems. This model provides techniques for
realizing the potential for "Domain-Directed Reuse", as defined
by Braun and Prieto-Diaz [Braun 1989].

The major motive for reconfigurable systems in the FDD is cost
reduction. Having a well-tested set of reusable components may
also increase reliability and shorten development schedules, but
cost is the primary factor in this environment. Research done by
the SEL indicates that verbatim software reuse (reuse without
modification) can produce major cost savings. The cost of
integrating a component that is reused verbatim is approximately
10 per cent of the cost of developing a new component from
scratch [Solomon 1987]. Analysis done for GENSIM indicated
that approximately 70 to 80 per cent of the code could be reused
verbatim, and that this should cut simulator development costs in
half [Markley 1987].


2. Reconfigurable Systems

This section focuses on the approaches taken and lessons
learned from the GENSIM and COMPASS projects. These
lessons influenced the reuse concepts and techniques defined in
the subsequent sections of the paper.

2.1 GENSIM Overview 

The GENSIM project was started in late 1986, and divided into
two major phases. The first phase lasted until mid-1988, with
the major products being the cost analysis cited above,
mathematical specifications, and the high level system design.
From mid-1988 to mid-1989 a small development team started
implementing prototype software. The project was terminated
before the prototype system was completed and evaluated, as
COMPASS incorporates simulation requirements into its broader
domain. Nonetheless, enough development work was done to
learn some useful lessons.

The generic simulator design consists of a set of "modules" that
plug into a standardized simulator architecture. Each of these
modules was expected to have a corresponding mathematical
specification, design data (object diagrams and Ada package
specifications), and source code. The use of standardized
specifications was intended to prevent the slight differences in
specifications that often impede verbatim reuse. In addition, the
GENSIM project intended to maintain test plans, data, and
software for each module, so that changes in standard modules
could be tested rapidly.

The simulator architecture is based on the designs of the first
two Ada simulators developed in the FDD. The enhancements
and changes to this architecture were intended to allow different
sets of modules to be configured into a system, depending on
the simulation requirements for a given satellite. It was possible
to generalize the early designs, but because these were early
designs, GENSIM incorporated some design flaws, even as
others were removed. The major results of GENSIM were
1) The concept of reusing products from all life cycle
phases presented no problems, and provided the anticipated
benefit of standardizing mathematical specifications. The
GENSIM team thoroughly specified the individual simulator
modules. However, the connections between modules were
made at design time, despite the fact that they represented
dependencies inherent in the problem. Not capturing these
dependencies in the specification was not a problem, since the
GENSIM team happened to be knowledgeable enough to assure
that a function needed by one module was provided by another.
Nonetheless, problem domain dependencies should preferably
be captured in the specifications, so that developers with less
domain expertise will have the information they need. The
COMPASS team is representing problem domain dependencies
in their standardized specifications.

2) The configuration of a system is done by instantiating all
the necessary generic Ada packages in the correct order. The
GENSIM team instantiated each package as a library unit in

where the same set of packages is used in each system,
generics can be combined so that a subsystem can be
"instantiated" through the instantiation of a single generic
package.

3) The legacy of the previous simulator architectures made
the implementation of standardized components more difficult.
In particular, the storage of inputs and results for a given
simulation scenario could not be adequately generalized. This
lesson is discussed in more detail in the next section.

2.2 GENSIM as a Standardized Architecture

The purpose of the flight dynamics simulators generalized by
GENSIM is to test the flight dynamics control algorithms for a
satellite before it is launched. Figure 1 shows the architecture
for a spacecraft simulator built from GENSIM modules. This
diagram shows the dependencies between major simulator
subsystems. The Truth Model represents the "true" response of
a spacecraft to its control system, and is configured using the
components needed for a specific satellite. The Spacecraft
Control subsystem contains new code that implements a
particular satellite's control laws. The remaining subsystems are
built to support these two subsystems, and must also be
configurable to support varying sets of modules. This
reconfigurability became especially cumbersome for the Case
Interface, which is the subsystem that manages input data and
results for simulation scenarios (cases). Figure 2 shows the two
major parts of Case Interface. All simulation inputs are
managed by Parameter Interface, and all results are managed
by Results Interface. These two subsystems are accessed by
both the user and the two simulation subsystems.

Figure 2. Case Interface Subsystem

The GENSIM configuration concept called for the subsystems of
the Case Interface to be built from components associated with
each module. Figure 3 shows how a parameter and results
database is created for a Fine Sun Sensor (FSS) module by
instantiating standardized generics. The "FSS_Database"
package is used by the module's initialization routine to get initial
parameters, and the "FSS_Results" package is used by the
module's computation routines to store simulated results. The
shaded areas show that the individual components fit into the
Case Interface packages. Figure 4 shows how several module
databases fit into the Parameter Interface subsystem.

The advantage of this approach is that the packages
Interface_Types and FSS_Types contain all the declarative
information needed to include a module in a simulator
configuration, and that standard types and protocols are used to
achieve this. The configuration parameters include default
values for module input parameters, flags indicating which
parameters a user is allowed to change, and similar flags
indicating what results a user may display during a simulation or
print after a simulation. The disadvantages of this design
approach are

1) the developer of a flight dynamics module has to be
aware of all the complexities inherent in the simulator
architecture, and all the dependencies shown in Figures 3 and
4, and

2) the parameters passed in and out of a package are
limited to the data types defined by Interface_Types. Module
specific enumeration types (such as "type FSS_POWER is
(OFF, ON)") cannot be passed to the user except by using the 'POS
attribute to convert to an integer which is then displayed.

Figure 5 shows an improvement to the architecture that
addresses the first disadvantage. The package FSS_ADT
exports an abstract data type (ADT) that implements all the
modeling of the fine sun sensor. Now the state of the FSS
module is based on this abstract data type, and the module's
functionality is implemented by calling the operations on the
type. This allows package FSS_ADT to be implemented by a
developer who is aware of all the nuances of fine sun sensor
modeling, and the FSS module can be implemented by a
developer who is aware of all the nuances of the simulator
architecture. In addition, FSS_ADT and all the other abstract
data types defined for the flight dynamics simulation domain can
be used to build a system with a completely different
architecture, without changing a line of code in the packages
that implement the modeling of the flight dynamics problem. An
architecture that addresses the limitations imposed by
Interface_Types can be built around such abstract data types, as
is shown in section 4. The separation of problem domain and
system architecture considerations is a key element of the reuse
models described in section 3.
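
The separation can be sketched in a few lines. The example below is Python rather than Ada and is only an illustration of the idea: the class names FineSunSensor and FSSModule, the measure operation, the field-of-view value, and the use of dictionaries to stand in for the parameter and results databases are all assumptions made for the sketch, not part of the GENSIM design.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Power(Enum):
    OFF = 0
    ON = 1


@dataclass
class FineSunSensor:
    """Domain ADT: knows sensor modeling, nothing about the simulator."""
    power: Power = Power.OFF
    field_of_view_deg: float = 32.0   # illustrative value only

    def measure(self, sun_angle_deg: float) -> Optional[float]:
        """Return the sun angle if the sensor is on and the sun is in view."""
        if self.power is Power.ON and abs(sun_angle_deg) <= self.field_of_view_deg:
            return sun_angle_deg
        return None


class FSSModule:
    """Simulator module: knows the architecture, delegates modeling to the ADT."""

    def __init__(self, parameters: dict, results: dict):
        self.parameters = parameters      # stands in for the parameter database
        self.results = results            # stands in for the results database
        self.sensor = FineSunSensor()

    def initialize(self) -> None:
        # Initial parameters come from the architecture side.
        self.sensor.power = Power[self.parameters.get("power", "OFF")]

    def compute(self, sun_angle_deg: float) -> None:
        # All modeling is done by operations on the abstract data type.
        self.results["fss_angle"] = self.sensor.measure(sun_angle_deg)
        self.results["fss_power"] = self.sensor.power.name


results = {}
module = FSSModule({"power": "ON"}, results)
module.initialize()
module.compute(10.0)
print(results)
```

In this arrangement FineSunSensor could be reused under a different architecture without change, since only FSSModule refers to the parameter and results stores.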

Figure 4. Parameter Interface Design

Figure 5. FSS Module's View of the Case Interface


2.3 COMPASS

COMPASS is the second FDD project that is developing
reconfigurable software. It has the same cost reduction goal as
GENSIM, but covers a much larger problem domain.
COMPASS is intended to support the flight dynamics
simulations area, mission planning and analysis both before and
after launch, and spacecraft attitude support systems for
mission operations. The estimated size of COMPASS is over a
million lines (counting all carriage returns) of Ada source code,
and it is targeted to run on several different computers. This
implies both being able to configure systems to run as
distributed systems, and being able to target the same functions
to different platforms. These considerations have prompted
refinements to the reuse model defined in [Booth 1989].

COMPASS also involves defining standardized
specifications to promote verbatim reuse. Unlike GENSIM, a
standard specification methodology has been defined for
COMPASS [Seidewitz 1989]. The COMPASS specification
concepts are object-oriented, but contain restrictions tied to both
reconfigurability and to project standards. For example, there is
a restriction on the number of levels of superclasses and
subclasses allowed in an inheritance hierarchy.


3. Reuse Concepts

To be able to design reconfigurable systems, it is necessary to
have some underlying principles that can be used as design
guidelines. The major concept defined in this paper is a Layered
Reuse Model that categorizes components by function and
defines dependencies among these components. The initial
model was developed as a result of the work done on GENSIM
and on an operational system, the Upper Atmosphere Research
Satellite (UARS) Telemetry Simulator (UARSTELS) [Booth
1989]. This model was primarily driven by the need to separate
problem domain and system architecture considerations, as is
discussed in section 2. This model does not address how to
incorporate very general components that have potential use
across several problem domains and/or architectures, nor does
it address the separation of system dependent features from
potentially portable code. The latter omission became obvious
when a multiplatform system such as COMPASS was
considered.


The Layered Reuse Model 

Figure 6 

To address the above issues, a "services" layer was added to 
the model. This services layer is split into a system dependent 
and a system independent layer. The updated reuse model is 
shown in Figure 6. A component in a given layer can only 
depend on components in layers below it, as is the case in any 
good layered model. The layers are defined as follows: 

System Architecture Templates - Components at this level 
provide a template into which modules fit. These can be 
reconfigurable subsystems such as the GENSIM Case Interface 
discussed above, or they can be standard components that do 
not depend on the particular configuration. In GENSIM the 
Display Interface and the Plot Interface were designed to be 
standard software, with any needed configuration data being 
provided by input files, rather than generic instantiation. 

System Modules - This layer contains components that are 
designed to fit into a standard design. These modules are built 
from components at the problem domain and service levels. 

Domain Definition Classes - These components define classes 
in the problem domain that are identified through domain 
analysis. They are generally implemented as Ada packages 
exporting abstract data types, as is discussed above. 

Domain Language Classes - Components at this level capture 
the vocabulary of a particular domain. In other words, these 
classes capture the knowledge and language that domain 
experts use to express the specifications for domain definition 
classes. In the flight dynamics domain, such classes would 
include "vector", "matrix", "orbit", and "attitude". The domain 
analyst would use these simpler classes to define more complex 
classes such as "Fine Sun Sensor". 

System Independent Services - This layer contains 
components that can be used in implementing both the problem 
domain layer and architecture layer components. They are 
usually usable in more than one problem domain and/or more 
than one system architecture. Components at this level include 
the generic data structures and tools provided by the Booch 
Components (TM) [Booch 1987], as well as portable interfaces 
to general services such as DEC's screen management 
routines. These portable interfaces can be moved to different 
computers, and new code or a different commercial product can 
be used to implement the same functions. Thus one ends up 
with multiple non-portable implementations of a single 
abstraction. Calls to this package should act the same, even if 
they are implemented in a machine dependent manner. 

System Dependent Services - This layer contains all the 
components that are dependent on a particular computer or 
operating system. This generally includes all non-Ada code, as 
most other languages have different non-standard extensions on 
different machines. This also includes Ada code that 
incorporates system dependent features such as Direct_IO files 
created with a non-null FORM parameter. These system 
dependent features should have system independent interfaces 
at a higher level. 
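
The layering rule stated above (a component may depend only on 
components in layers below it) can be checked mechanically. The 
following is a minimal Python sketch, not part of the original paper. 
The component names and the dependency table are hypothetical, and the 
check takes the rule at its strictest reading, so a dependency inside 
one layer is also reported; the ordering within the problem domain 
layers would need a relaxed comparison. 

```python
# Layers of the reuse model, listed from top (index 0) to bottom.
LAYERS = [
    "System Architecture Templates",
    "System Modules",
    "Domain Definition Classes",
    "Domain Language Classes",
    "System Independent Services",
    "System Dependent Services",
]
RANK = {name: index for index, name in enumerate(LAYERS)}


def illegal_dependencies(layer_of, depends_on):
    """Return (client, supplier) pairs that break the layering rule.

    Strict reading (an assumption): the supplier must sit in a layer
    strictly below the client's layer.
    """
    bad = []
    for client, suppliers in depends_on.items():
        for supplier in suppliers:
            if RANK[layer_of[supplier]] <= RANK[layer_of[client]]:
                bad.append((client, supplier))
    return bad


# Hypothetical component catalogue, for illustration only.
layer_of = {
    "Case_Editor": "System Architecture Templates",
    "FSS_Module": "System Modules",
    "FSS_ADT": "Domain Definition Classes",
    "Vector": "Domain Language Classes",
    "Screen_IO": "System Independent Services",
    "SMG_Binding": "System Dependent Services",
}
depends_on = {
    "Case_Editor": ["FSS_Module", "Screen_IO"],
    "FSS_Module": ["FSS_ADT"],
    "FSS_ADT": ["Vector"],
    "Screen_IO": ["SMG_Binding"],
    "Vector": ["Case_Editor"],  # deliberately wrong: points upward
}

violations = illegal_dependencies(layer_of, depends_on)
```

With this table, only the upward reference from Vector to Case_Editor 
is reported. 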

The improved model takes an object-oriented approach to 
specifying the problem domain. The domain definition classes 
and domain language classes form the two major groupings 
within the problem domain. Each of these two groups is also 
organized with the more domain specific classes depending on 
the more general classes. For example, the flight-dynamics 
classes "orbit" and "attitude" depend on the more general 
classes "vector" and "matrix". 

The layered reuse model does not depend on Ada, but the Ada 
language contains features that support this model well. The 
use of generic packages allows each of the problem domain 
classes to be implemented as a generic unit that is completely 
decoupled from all other classes. In addition, the generic formal 
definitions associated with a package capture all the information 
about dependencies in a single location, rather than distributing 
external references throughout the code. Another useful feature 
is the separation of package specifications (and subprogram 
and task specifications as well) from their implementations. This 
is useful in hiding system dependent services, which can then 
have the system independent part defined at the appropriate 
layer. For example, the interface to a system dependent math 
library would be classified within the problem domain, and the 
interface to system-dependent screen management routines 
could be system independent services. The 5 top levels in this 
model would then contain system independent Ada code, which 
would be expected to be completely portable. This is not a 
consequence of attempting to make the highest layers portable, 
but rather is a benefit of isolating the known system 
dependencies, and using a standardized programming 
language. Using Ada leads naturally to having most reusable 
components also be portable. Similar portability may be 
attainable using C. It is almost certainly not attainable with 
FORTRAN, as the dialects vary too greatly between machines. 
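
The idea of one system independent interface over several 
interchangeable system dependent implementations can be sketched as 
follows. This is an illustrative Python analogue of separating an Ada 
package specification from its body, not code from the paper. The 
names ScreenServices, DictScreen, ListScreen, and show_parameters are 
assumptions made up for the example. 

```python
from abc import ABC, abstractmethod
from typing import List


class ScreenServices(ABC):
    """System independent interface (hypothetical); clients see only this."""

    @abstractmethod
    def put_line(self, row: int, text: str) -> None:
        """Write text on the given row, replacing what was there."""

    @abstractmethod
    def contents(self) -> List[str]:
        """Return the written rows, ordered by row number."""


class DictScreen(ScreenServices):
    """Stand-in for one platform's implementation: rows kept in a dict."""

    def __init__(self) -> None:
        self._rows = {}

    def put_line(self, row: int, text: str) -> None:
        self._rows[row] = text

    def contents(self) -> List[str]:
        return [self._rows[r] for r in sorted(self._rows)]


class ListScreen(ScreenServices):
    """Stand-in for another platform: rows kept as (row, text) pairs."""

    def __init__(self) -> None:
        self._pairs = []

    def put_line(self, row: int, text: str) -> None:
        # Drop any earlier text on this row, then record the new text.
        self._pairs = [p for p in self._pairs if p[0] != row] + [(row, text)]

    def contents(self) -> List[str]:
        return [text for _, text in sorted(self._pairs)]


def show_parameters(screen: ScreenServices, params: dict) -> List[str]:
    """Client code in a higher layer: depends only on the interface."""
    for row, (name, value) in enumerate(sorted(params.items()), start=1):
        screen.put_line(row, f"{name} = {value}")
    return screen.contents()


params = {"gain": 2.5, "bias": 0.1}
dict_output = show_parameters(DictScreen(), params)
list_output = show_parameters(ListScreen(), params)
```

Both implementations give the same result to the caller, which is the 
property the text asks of calls to a portable interface. 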

4. Example 

This section presents an improved GENSIM design as an 
example of how to use the layered model. This new design is 
presented at the same level of detail as the original GENSIM 
design presented in section 2. Figure 7 shows the improved 
simulator design. 

Improved Simulator 
Architecture 



Figure 7 

The key differences in this design are the location of the Case 
Interface subsystem and the new I/O services subsystem. In 
addition, the Spacecraft Control, Truth Model, and Utilities 
subsystems are combined into the Simulator subsystem. Figure 
8 shows that the dependencies between these three subsystems 
are the same as in the original architecture (Figure 1), but that 
now none of these subsystems depends on Case Interface. 




This extra design level is not carried through to 
implementation. Subsystems may be implemented as a single 
package which provides an interface to all the subsystem's 
components, but in this case the Simulator subsystem is merely 
a logical grouping intended to reduce the design complexity. 

Simulator Subsystem 

Figure 7 also shows the three major layers of the reuse model. 

In this design, the I/O services consist of standard Ada packages 
such as Text_IO or Direct_IO, and an interface to DEC's Screen 
Management Guidelines (SMG) routines. Figure 9 shows the 
interrelationship between the FSS module and the simulator 
architecture. Here the abstract data type for a sensor is created 
by instantiating a generic package. The generic ADT is 
designed so that all external dependencies are captured in the 
generic formal part. These dependencies include types provided 
by the simulator's Math_Types package, and functions to select 
information from the Sun and Dynamics modules. The 
FSS_Objects package uses the ADT (private type) exported by 
the FSS_ADT package to define its package state, and the 
FSS_Parameter_Display package uses visible types exported 
by FSS_ADT to define parameter screens. The 
FSS_Parameter_Display package also instantiates 
Enumeration_IO using "type FSS_POWER is (OFF, ON)" as the 
parameter. This removes the reliance on using the 'POS 
attribute of enumerated types that has been a feature of all FDD 
simulators up until now. 
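
The difference between position-based and name-based enumeration 
I/O can be illustrated with a small Python analogue. This is a sketch 
only: FssPower mirrors the FSS_POWER type named above, while the three 
helper functions are hypothetical and merely imitate, respectively, 
reliance on 'POS and the behavior of an Enumeration_IO instantiation. 

```python
from enum import Enum


class FssPower(Enum):
    """Python stand-in for: type FSS_POWER is (OFF, ON)."""
    OFF = 0
    ON = 1


def parse_by_position(text: str) -> FssPower:
    """'POS-style input: the user must know the position number."""
    return FssPower(int(text))


def parse_by_name(text: str) -> FssPower:
    """Name-based input: the literal itself is read.

    Ignoring surrounding blanks and letter case is an assumption made
    here to resemble reading an Ada enumeration literal.
    """
    return FssPower[text.strip().upper()]


def format_by_name(value: FssPower) -> str:
    """Name-based output: the literal is displayed, not its position."""
    return value.name
```

With name-based I/O a parameter screen can show and accept ON or OFF 
directly, and the displayed text no longer depends on the order in 
which the literals were declared. 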

Figure 10 shows how the FSS_Parameter_Display package fits 
into the design of the Case Editor subsystem. The Case Editor 
subsystem is the part of the User_Interface that allows a user to 
change any of the initial parameters for a simulation. The 
Parameter_Editor package tracks which displays the user has 
selected and calls the appropriate parameter display package. 
The difference is that now the User_Interface controls the 
initialization of simulation parameters, rather than the simulator 
components requesting initial values from a database contained 
within the Case Interface. 

In this example, the use of the layered model removes the Truth 
Model's complex dependencies on the Case Interface packages 
shown in Figure 3. This enables the Simulator subsystem 
components to be usable within more than one architecture. 

The placing of the system architecture subsystems above the 
Simulator subsystem also allows general purpose service layer 
components to be enhanced as needed to integrate a given 
module into the system architecture. The 
FSS_Parameter_Display demonstrates this concept by using 
Enumeration_IO to add to the general IO services. 

FSS Module Design 



5. Future Directions 

This paper describes a general reuse model for designing 
reconfigurable systems. The next step is to map the layered 
reuse model to Ada design and implementation concepts. The 
high-level designs presented in this paper use generic packages 
to help parametrize systems. There are many possible ways to 
incorporate generic packages into a larger design. These "reuse 
in the small" techniques include nesting generic instantiations, 
nesting generic definitions, and creating dependencies between 
library instantiations [Booth 1989]. This paper has used the last 
technique so that while generic instantiations are coupled, each 
of the generic units is completely decoupled from the others. 

The layered reuse model provides a sound basis for project 
management. By strictly separating the problem domain issues 
from the system architecture issues, a manager can assign the 
appropriate experts to implement packages within each layer of 
the model. Improving the allocation of personnel to tasks should 
improve both productivity and software quality. As this model is 
used, an understanding of what proportion of a system falls into 
which layer will evolve. 

User Interface Case Editor 
Design 

Figure 10 

The layered reuse model also can be used to understand which 
software is most critical. Layered models have seen the most 
use in operating system design. The kernel of an operating 
system typically requires the most attention, despite the fact it is 
a relatively small proportion of the code. This is because all 
other layers depend on its correctness and efficiency. The 
analogous layers in the reuse model are the service layers and 
the domain language layer. Additional evidence for the 
assertion is that the FDD has observed performance 
degradation in its Ada simulators due to the inefficient 
implementation of mathematical utilities packages. 

In addition to the performance problems observed above, there 
is a concern that layered implementation models may be 
inherently slow due to the addition of extra levels of procedure 
calls to accomplish the same work. The FDD encountered this 
problem with a commercially provided graphics interface that 
provides the same FORTRAN interface routines on a VAX or an 
IBM mainframe. Whether this is due to extra procedure calls or 
generally inefficient implementation is unclear. Ada addresses 
the former problem by providing pragma Inline. The latter 
problem must be addressed by improving the software. If the 
software design and implementation is done properly, the 
layered reuse model should not degrade performance. 

6. Conclusion 

In "Domain-Directed Reuse", Braun and Prieto-Diaz extract 
properties that are common to applications (such as compiler 
design) where a high degree of reuse is already being obtained 
[Braun 1989]. These properties are a focus on a particular 
application domain, assumptions about system architecture 
constraints, and a set of generalized and well defined interfaces. 
The layered reuse model provides design concepts for 
examining application domains and defining standardized 
architectures. These techniques will help realize the potential 
inherent in the concept of domain directed reuse. 


Abstract 

Formal specifications can enhance the quality, reli- 
ability, and even reusability of software; they are 
precise, can be complete in some sense, and are 
mechanically processable. Despite these benefits, 
formal specifications are seldom used in practice 
for several reasons: programmers lack an adequate 
background; both concepts and notations in spec- 
ification languages appear obtuse to programmers; 
formal specifications are sometimes too high-level, 
providing too large a gap from the specification to 
the implementation; methods are not tailored to the 
environment; and fully formal methods are expensive 
and time consuming. 

In this paper we present PUC (pronounced 
pook), a specification language for Ada that addresses 
programmers' concerns for understandability. 
PUC is a functional language whose syntax 
and data types resemble Ada's, although it has 
features like parametric polymorphism and higher-order 
functions. The paper shows the requirements 
for the language PUC; presents an overview 
of the language and how it is used in the specification 
of Ada programs; and gives the requirements 
and strategies for a semi-automatic translator from 
PUC to Ada. 

1 Introduction 

The practical use of formal specifications in program 
development is an important goal in software 
engineering because formal specifications can 
enhance the quality, reliability, and reusability of 
software. Formal specifications are precise, can be 
complete (in some sense), and are mechanically 
processable (e.g., consistency checks). Since one of the 

* Research supported in part by NASA Goddard Space 
Flight Center grant NSG-5123. 

^ Additional support from ODEPLAN and Catholic Uni- 
versity of Chile. 


main problems in software reusability is determin- 
ing what is the functionality of a subprogram or 
module, having a precise description will also im- 
prove software reuse, lowering costs and improving 
quality by using already tested components. 

Despite the benefits of formal methods, few are 
used in practice. There are several reasons: pro- 
grammers do not have adequate background; both 
concepts and notation in specification languages 
are usually mathematically oriented; there is a big 
conceptual gap from a very high level specification 
down to the details of the implementation; methods 
cannot be tailored to the environment; fully formal 
systems are very expensive and time consuming, 
and much software is not critical enough to justify 
this cost. 

A precise mathematical specification is useful 
only if it is understood by the persons involved in 
the development, so notational considerations are 
very important. Two aspects of the specification 
language have to be considered: the conceptual or 
semantic level and the syntactic level. The concepts 
represented in the language have to be very high 
level, like the concepts in the domain area, but not 
too high level, or else there will be a big conceptual 
gap from the specification to the implementation. 
Hence, there is a trade-off in the design of a spec- 
ification language: if it is not very high level, the 
program analysts and designers have a hard time; 
if it is too high level, the implementors must figure 
out the algorithms from scratch. This trade-off 
is summarized by the question of how much design 
should be implicit in the specifications [13]. 

The second aspect of notation is syntax. Syntactic 
issues are sometimes dismissed as syntactic 
sugar; this is fine for a researcher who knows many 
programming languages and can learn another one 
very fast, but for most professional programmers 
syntax is important. In particular, a syntax that 
is similar to but conflicting with the implementation 
language is confusing. 

This work grew out of studies within the Software 
Engineering Laboratory (SEL) of NASA Goddard 
Space Flight Center. The SEL has been monitoring 
the development of ground support software for 
unmanned spacecraft since 1976. Our goal is to improve 
the quality of software specifications within 
the SEL, to improve both software development and 
testing [13]. We approach this goal by increasing 
the use of formal methods in software specifications. 
The SEL environment is characterized by 
large (tens of K lines of code) scientific software 
with complex functions and complex structure; potential 
reuse of products and processes; programs 
written in Ada using object-oriented design; few 
critical timing constraints; and programmers without 
background in logic and abstract algebra. To 
increase formality of specifications, we designed the 
specification language PUC suitable to this kind 
of environment; in particular, users of PUC are not 
required to know advanced logic or abstract algebra. 

Overview of the paper. The next section dis- 
cusses the need for a new language. Section 3 
presents the principal aspects of the PUC language, 
along with examples. Section 4 shows how Ada pro- 
grams can be developed using PUC as the specifi- 
cation language. The example presented is a sim- 
plified telemetry processor for satellite data. The 
final section contains a summary, conclusions, and 
a description of further work. 

2 Why Another Language 

There are many specification languages, yet we have 
not found any that is suitable for our needs. This 
section motivates the design of PUC, by presenting 
previous work, design objectives and rationale. 

2.1 Previous work 

Specification languages proposed specifically for the 
Ada programming language are based either on first 
order predicate logic, Horn clauses, algebraic ab- 
stract data types, or procedural description. 

Booch proposes to use Ada itself as a specification 
language for Ada programs. “Not only is Ada 
suitable as an implementation language, but it is ex- 
pressive enough to serve as a vehicle for capturing 
our design decisions.” [1, page 50] However, most 
design decisions that can be written in Ada are of 


syntactic nature. This includes functional decom- 
position, but the meaning of subprograms cannot be 
expressed in Ada without writing them in whole. 

Anna (ANNotated Ada) is a specification language 
designed to provide machine-processable explanations 
of Ada programs [9]. Anna programs are 
Ada programs with formal comments that describe 
the functional requirements for the program; properties 
of its components (variables, subprograms, 
modules); and how these components interact. Formal 
comments are in the form of pre- and post-conditions, 
module invariants, type constraints, and 
other assertions. Anna programs are executable be- 
cause they are Ada programs, but the specifications 
themselves are only executable in the form of run- 
time testing for consistency. 

The PLEASE specification language for Ada is 
based on logic restricted to Horn clauses [14]. 
PLEASE borrows from Anna the idea of writing 
formal comments in Ada programs. Programs in 
PLEASE are executable so they can be used to 
build prototypes, in which incomplete Ada programs 
call some procedures specified in PLEASE. 
Unfortunately, pure Horn clauses are so inefficient 
that operational semantics (order of evaluation and 
PROLOG cut command) have to be explicitly declared, 
complicating the specification. 

The specification language Larch/Ada-88 also 
uses formal comments within Ada programs [11]. 
This language is one of the interface languages of 
the Larch family of specification languages. Larch 
specifications are done at two levels: the meaning 
of the abstractions used by the program are defined 
using the Larch shared language [5], and then one of 
the Larch interface languages is used to state what 
the program does in terms of these abstractions. 
The Larch shared language is based on algebraic 
abstract data types. Using Larch/Ada-88 and the 
prototype tools described in [11] it will be possi- 
ble to develop verified Ada programs, hence this 
method is fully formal. 

2.2 Design objectives 

We set several specific goals in the design of the 
language to make it useful in the SEL environment. 
These goals sometimes conflict with 
each other. 

• The language should bridge the usual gap between 
very high level logical specifications and 
the detailed data and control management in 
Ada. 

• The language should be expressive and extensible. 
It should be easy to code domain specific 
concepts in a specification library. 

• Specifications should be easily translated into 
Ada programs; only rarely used constructs are 
allowed not to have a simple Ada representation. 

• The language should be easy to learn for 
Ada programmers. It should have few concepts 
and very few concepts not present in Ada. Syntax 
should be Ada-like. 

• The language should be executable, so that the 
specifications can be used as a prototype and 
in preparing test data for the final application. 

2.3 Design rationale 

Our first design decision is the semantic model on 
which PUC is based; that is, whether PUC spec- 
ifications will consist of Horn clauses, procedural 
descriptions, etc., either purely or in combination 
with other semantic models. 

Some researchers in formal specifications have 
advocated using both purely functional languages 
[3, 6, 16] and logic-based languages [8] for specifi- 
cations, based mainly on the separation of concerns 
between what is intended and how it is achieved, es- 
pecially in the management of data structures. The 
expressive power of logic languages and functional 
languages is not comparable, because logic languages 
can accommodate non-determinism whereas 
functional languages can be higher order [16]. Even 
though both logic and functional languages can be 
executable, we agree with Hoare in that “a mod- 
ern functional programming language can provide 
a nice compromise between the abstract logic of a 
requirements specification and the detailed resource 
management provided by procedural programming” 
[7, page 90]. These arguments have influenced our 
decision to design a purely functional programming 
language to specify Ada programs. 

Most functional languages have mathematical 
notation which makes them amenable to formal 
proofs; however, they have been developed for 
programmers with extensive mathematical background. 
Our goal in the design of PUC has been to 
make a specification language for Ada programmers 
who do not necessarily have this background. If formal 
proofs are needed, PUC specifications can be 
easily translated into recursion equations to prove 
properties of them. 


Hence, both syntax and semantics of PUC are 
similar to familiar programming constructs. For example, 
instead of free algebras and pattern matching, 
in PUC there are variant records and case 
expressions. The few constructs of PUC that are not 
present in Ada are explicit. For example, polymorphism 
is explicit in the declaration of polymorphic 
objects, and Currying (i.e., creating a higher order 
function by partial parameterization) is accomplished 
using predefined functions instead of just 
omitting parameters. 

3 The Specification Language 
PUC 

This section presents the main aspects of PUC. A 
technical report gives further details and a BNF 
description of the grammar [12]. 

3.1 Overview 

PUC is a purely functional programming language 
with parametric polymorphism [2] and Ada-like 
syntax designed to serve as a specification language 
for Ada programs. Because PUC programs are executable, 
we will call PUC either a specification language 
or a programming language appropriate for 
prototyping. 

A PUC program consists of a sequence of decla- 
rations of types and objects (functions and data). 
Type declarations give a name to a type and are 
needed to create new types. Object declarations 
give a name to an object, which represents a func- 
tion or data object; they are either like Ada function 
definitions or like Ada assignments, where the defin- 
ing symbol := is read as is equal by definition and 
represents the relationship of that object to other 
objects. This results in implied execution sequences 
by virtue of the partial ordering of these object re- 
lationships. 

Example The following program consists only 
of data object declarations. The value of root is 
computed from the values of a, b, and c. 

root := (- b + sqrt(b*b - 4*a*c)) / 2; 
a := 2.0; 
b := -4.0; 
c := 2.0; 

Example The program below defines result to 
be the factorial of 5. The program consists of two 
object declarations: fact and result. 

result := fact(5); 

function fact (n: integer) return integer is 
begin 
  if n = 0 then 1 else n * fact(n-1) end 
end; 
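
The first example relies on the fact that declarations may appear in 
any order: root is written before a, b, and c, and the evaluation 
order follows from the dependencies. A minimal Python sketch of this 
demand-driven reading follows. It is an illustration only and says 
nothing about how a PUC implementation actually evaluates programs; 
the evaluate function and the dictionary encoding are assumptions. 

```python
import math

# Each declaration maps a name to a function that computes its value,
# given a lookup function "get" for the other declared names.
# The formula is copied from the PUC example as printed.
declarations = {
    "root": lambda get: (-get("b")
                         + math.sqrt(get("b") * get("b")
                                     - 4 * get("a") * get("c"))) / 2,
    "a": lambda get: 2.0,
    "b": lambda get: -4.0,
    "c": lambda get: 2.0,
}


def evaluate(decls):
    """Evaluate every declaration once, whatever the textual order."""
    values = {}

    def get(name):
        # Compute a value the first time it is needed, then reuse it.
        if name not in values:
            values[name] = decls[name](get)
        return values[name]

    for name in decls:
        get(name)
    return values


values = evaluate(declarations)
```

Reordering the entries of declarations leaves the computed values 
unchanged, which is the point of reading := as "is equal by 
definition" rather than as assignment. 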

3.2 Data types 

PUC is a strongly typed language, like Pascal or 
Ada. However, PUC types are higher level than Ada 
types. For example there are lists instead of arrays; 
recursive records instead of records and accesses. 
That means that PUC is easier to use, but not as 
efficient as Ada. There are four kinds of data types 
in PUC: scalar types, list types, record types, and 
function types. 

The scalar data types in PUC are: integer, 
real, boolean, character, and enumerated types. 
Numeric types have the usual arithmetic operators 
(+ - * / rem); the boolean type has the operators: 
not, and, and or; and relational operators (= /= < 
<= > >=) are defined for scalar types. Precedence 
rules are the same as Ada. 

Lists are unbounded sequences of objects of the 
same type. Constant lists are represented using 
square brackets. The catenate operator is &; subscripting 
and slicing (sublist) is done using parentheses. 
Strings are simply lists of characters. 

Example Given the definition of nums, the following 
equalities hold. 

nums := [10,20,30,40,50,60,70,80,90,100]; 

nums(4) = 40 
nums(2..3) = [20,30] 
[20,30,10] = nums(2..3) & nums(1..1) 
nums(8..8) = [nums(8)] 
length(nums) = 10 
"string" = ['s','t','r','i','n','g'] 
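
The equalities above can be replayed in Python as a quick check of 
the list operations. This is a sketch under the assumption that PUC 
subscripts start at 1 and that slices include both bounds, as the 
examples suggest; the helpers at and sub are hypothetical names 
introduced only to hide Python's 0-based indexing. 

```python
def at(xs, i):
    """PUC-style subscript xs(i), counting from 1."""
    return xs[i - 1]


def sub(xs, lo, hi):
    """PUC-style slice xs(lo..hi), both bounds included."""
    return xs[lo - 1:hi]


nums = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

checks = [
    at(nums, 4) == 40,
    sub(nums, 2, 3) == [20, 30],
    [20, 30, 10] == sub(nums, 2, 3) + sub(nums, 1, 1),  # + plays the role of &
    sub(nums, 8, 8) == [at(nums, 8)],
    len(nums) == 10,
    list("string") == ["s", "t", "r", "i", "n", "g"],
]
```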

PUC records are very similar to Ada records; 
component selection uses the typical dot notation. 
Records can have variant parts and can be recursive. 
Variant records have components that depend 
on a tag, whose type must be boolean or an enumerated 
type. For example, type expr is a recursive 
record with variants to represent arithmetic expressions 
of integers. 

type expr_kind is (number, plus, minus, 
                   multiply, divide); 

type expr is 
  record 
    case kind: expr_kind is 
      when number => val: integer; 
      when others => left, right: expr; 
    end; 
  end record; 

The null record (compatible with all record 
types and similar to Ada's null access) is used 
to build finite recursive records without variant 
parts [10]. For example, the recursive record type 
int_tree represents binary trees of integers. Note 
the use of the type name as a constructor for constant 
records. 

type int_tree is 
  record 
    datum: integer; 
    left, right: int_tree; 
  end; 

a_tree := int_tree'(5, null, 
                    int_tree'(8, null, null)); 

In addition to the arithmetic, list, and record expressions, 
there are two structured expressions, if 
and case. The syntax for these expressions is similar 
to the corresponding statements in Ada; the difference 
is that in place of a sequence of statements, 
a single expression is expected. 

3.3 Functions 

Functions in PUC behave like mathematical func- 
tions, mainly due to their declarative — as opposed 
to imperative — nature. Table 1 shows the main 
differences between Ada and PUC functions. Al- 
though functions are declared using a syntax simi- 
lar to Ada, the text between the begin and the end 
is not a sequence of commands, but an expression. 
Usually this expression will involve conditionals and 
recursion. 

Example Function eval evaluates an expression 
represented with the type expr from Section 3.2. 

function eval (exp: expr) return integer is 
  function eval_oper (exp: expr) is 
    l := eval(exp.left); 
    r := eval(exp.right); 
  begin 
    case exp.kind is 
      when plus => l + r 
      when minus => l - r 
      when multiply => l * r 
      when divide => l / r 
    end 
  end eval_oper; 
begin 
  if exp.kind = number then exp.val 
  else eval_oper(exp) end 
end eval; 


Ada Functions                          PUC Functions 

Can cause side effects                 No concept of side effects in PUC 
Can be generic                         Can be polymorphic and higher order 
Cannot return a function as result     Fully higher order 
Can have local types, functions,       Can have local types, functions and constants 
procedures, variables, constants, ... 
Body expressed using control flow      No concept of control flow; only conditionals 
statements                             and recursion 


Table 1: Differences between Ada and PUC functions. 
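
The variant record expr of Section 3.2 and the function eval above 
have a close counterpart in most languages with tagged data. The 
Python sketch below is an analogue for illustration, not a translation 
produced by any PUC tool. The class names Number and BinOp are 
assumptions, as is the choice to make integer division truncate toward 
zero as in Ada, since the text does not state how PUC defines "/" on 
integers. 

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Number:
    """Variant selected by kind = number."""
    val: int


@dataclass(frozen=True)
class BinOp:
    """The remaining variants: plus, minus, multiply, divide."""
    kind: str
    left: "Expr"
    right: "Expr"


Expr = Union[Number, BinOp]


def truncating_div(l: int, r: int) -> int:
    """Integer division rounding toward zero (assumed semantics)."""
    q = abs(l) // abs(r)
    return q if (l >= 0) == (r >= 0) else -q


OPERATORS = {
    "plus": lambda l, r: l + r,
    "minus": lambda l, r: l - r,
    "multiply": lambda l, r: l * r,
    "divide": truncating_div,
}


def eval_expr(exp: Expr) -> int:
    """Evaluate an expression tree, mirroring eval and eval_oper."""
    if isinstance(exp, Number):
        return exp.val
    return OPERATORS[exp.kind](eval_expr(exp.left), eval_expr(exp.right))


# 2 + 3 * 4, written as a tree.
sample = BinOp("plus", Number(2), BinOp("multiply", Number(3), Number(4)))
```

Like the PUC version, eval_expr has no side effects: its body is a 
single conditional expression over recursive calls. 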


3.4 Polymorphism 

An object is polymorphic if it can have more than 
one type. PUC has parametric polymorphism, 
where the type of an object can depend on another 
type [2]. This is similar to generic type parameters, 
although more general. PUC has polymorphic 
functions and polymorphic record types. Polymorphic 
functions are declared by preceding the types 
of the parameters by a question mark (this declares 
an implicit type parameter). 

Example The following polymorphic functions 
operate on lists of any base type. 

function length (L: list of ?element) is 
begin 
  if L = [] then 0 else 1 + length(rest(L)) end 
end; 

function cons (elem: ?a; L: list of ?a) 
  return list of a is 
  ( [elem] & L ); 

function find (value: ?a; L: list of ?a) 
  return list of a is 
begin 
  if L = [] then L 
  elsif L(1) = value then L 
  else find(value, rest(L)) 
  end 
end; 
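
For readers more familiar with type variables than with the ? 
notation, the three functions above can be approximated in Python with 
typing.TypeVar. This is a sketch for comparison only. The helper rest 
is assumed to return the list without its first element, which is how 
the PUC examples appear to use it. 

```python
from typing import List, TypeVar

A = TypeVar("A")  # plays the role of the implicit type parameter ?a


def rest(xs: List[A]) -> List[A]:
    """The list without its first element (assumed meaning of rest)."""
    return xs[1:]


def length(xs: List[A]) -> int:
    return 0 if xs == [] else 1 + length(rest(xs))


def cons(elem: A, xs: List[A]) -> List[A]:
    return [elem] + xs


def find(value: A, xs: List[A]) -> List[A]:
    """Return the suffix of xs starting at the first match, or []."""
    if xs == []:
        return xs
    elif xs[0] == value:
        return xs
    else:
        return find(value, rest(xs))
```

The same definitions work for lists of integers, characters, or 
records, because nothing in the bodies depends on the element type 
beyond equality. 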

Polymorphic records are used to define different 
records given a base type; they are also called type 
constructors. The example below defines a type 
constructor for binary trees which is used in the 
definition of a binary tree of integers. 

type tree of elem is 
  record 
    datum: elem; 
    left, right: tree of elem; 
  end; 

type int_tree is tree of integer; 

Polymorphic types are usually used in conjunction 
with polymorphic functions that operate on the 
type. For example, function traverse_tree builds 
a list from the in-order traversal of a binary tree. 

function traverse_tree (t: tree of ?elem) 
  return list of elem is 
begin 
  if t = null then [] 
  else traverse_tree(t.left) & [t.datum] & 
       traverse_tree(t.right) 
  end 
end; 
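
A type constructor such as "tree of elem" corresponds roughly to a 
generic class. The Python sketch below pairs a generic Tree with the 
in-order traversal; it is an analogue for illustration, with None 
standing in for the PUC null record, and the sample tree is 
hypothetical. 

```python
from dataclasses import dataclass
from typing import Generic, List, Optional, TypeVar

T = TypeVar("T")


@dataclass
class Tree(Generic[T]):
    """Binary tree over any element type; None plays the null record."""
    datum: T
    left: Optional["Tree[T]"] = None
    right: Optional["Tree[T]"] = None


def traverse_tree(t: Optional[Tree[T]]) -> List[T]:
    """In-order traversal: left subtree, then datum, then right subtree."""
    if t is None:
        return []
    return traverse_tree(t.left) + [t.datum] + traverse_tree(t.right)


# An integer tree with root 5, left child 3, and right child 8.
int_tree = Tree(5, Tree(3), Tree(8))
in_order = traverse_tree(int_tree)
```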

3.5 Higher order functions 

Higher order functions are those that have functions 
as parameters or compute a function as a result. A 
limited form of higher order functions is present in 
languages like FORTRAN or Pascal, where it is pos- 
sible to specify a subprogram passed as a parameter 
to another subprogram. PUC is fully higher order 
because it imposes no restrictions on the kinds of 
higher order functions (e.g., a function can return 
a higher order function). A very limited form of 
higher order functions can be simulated in Ada us- 
ing generics. 

Usually higher order functions are polymorphic 
because they operate on polymorphic data struc- 
tures (e.g., lists), but these two language features 
are independent. Figure 1 shows the definition 
of some standard higher order functions which are 
useful in defining other functions without explic- 
itly writing the whole functions; that is, the use 
of higher order functions enhances reusability. Fig- 
ure 2 shows several functions defined in terms of 
polymorphic functions; some of them were previ- 
ously defined explicitly. 






-- APPLY - a list with the application of f to the elements of L 
function apply (f: function(?a) return ?b; L: list of ?a) return list of b is 
begin 
  if L = [] then [] else [f(L(1))] & apply(f, rest(L)) end 
end; 

-- FOLD_R - the right folding of list L with function f 
function fold_r (f: function(?a,?b) return ?b; init: ?b; L: list of ?a) return b is 
begin 
  if L = [] then init else f(L(1), fold_r(f, init, rest(L))) end 
end; 

-- FOLD_R_1 - the right folding of nonempty list L with function f 
function fold_r_1 (f: function(?a,?a) return ?a; L: list of ?a) return a is 
begin 
  fold_r(f, L(1), rest(L)) 
end; 

-- CURRY - a function like f, but with the first parameter fixed 
function curry (f: function(?a,?b) return ?c; param1: ?a) is 
begin 
  function (param2: b) return c is f(param1, param2) 
end curry; 

-- FOLD_TREE - the folding of binary tree t with function f 
function fold_tree (f: function(?b,?a,?b) return ?b; init: ?b; t: tree of ?a) return b is 
begin 
  if t = null then init 
  else f(fold_tree(f,init,t.left), t.datum, fold_tree(f,init,t.right)) end 
end fold_tree; 


Figure 1: Some standard polymorphic functions. 


function traverse_tree (t: tree of ?a) return list of a is 
  function combine (l: list of a; elem: a; r: list of a) return list of a is (l & [elem] & r); 
begin 
  fold_tree(combine, [], t) 
end traverse_tree; 

function sum_of_nodes (t: tree of integer) return integer is 
  function add3 (l, elem, r: integer) return integer is (l + elem + r); 
begin 
  fold_tree(add3, 0, t) 
end sum_of_nodes; 

function concat (L: list of list of ?a) return list of a is 
begin 
  fold_r("&", [], L) 
end; 


Figure 2: Functions defined using polymorphic functions. 

4 Developing Ada Programs with PUC 

There are several approaches for developing Ada 
programs using PUC. One is to use PUC only as a 
formal documentation aid, taking advantage of its 
defined semantics, but not its executability. Using 
PUC simply as a notation requires in principle no 
software tool, but this is very limited; at least a 
parser and consistency checker have to be provided. 
But if there is a parser then it is relatively easy to 
build a translator or interpreter, so that 
specifications in PUC can be used as prototypes. 

Another way of using PUC specifications is to 
generate Ada implementations by means of a 
semi-automatic translation, in which a programmer 
decides implementation issues and can even modify 
the generated code. This choice seems to be more 
attractive than the others, because it provides a 
smooth transition from specifications to programs, 
but the caveat is that not all PUC constructs have 
a simple representation in Ada (e.g., Ada has no 
higher order functions). 

These approaches are not fully formal develop- 
ment systems in the sense that it is still possible to 
write a program inconsistent with its specification. 
While this is not optimal, we think that our soft- 
ware engineering environment is not mature enough 
for a fully formal system, and that experience with 
semi-formal specifications (and development) is re- 
quired before a fully formal development system can 
be used effectively. 

In order to provide a translator from PUC to Ada 
it is first necessary to determine a set of 
translation rules that will preserve the semantics of 
the specification. Although this set will not be 
sufficient to translate any PUC program into Ada, we 
need to be able to translate most PUC programs, 
or else the method is impractical. There is an 
additional restriction we impose on the system: to 
facilitate manual modification of the generated Ada 
code (e.g., for optimization or maintenance) we want 
the generated Ada code to resemble the PUC 
specification. 

Since PUC is syntactically similar to Ada, some 
PUC constructs require simple translations or even 
no translation at all. For example, enumerated 
types and simple record types are almost identical 
in both languages; recursive record types are 
translated into an access type and a record containing 
access fields. However, not all translations are so 
simple, because the semantics of PUC and Ada are 
quite different. It is particularly difficult to 
provide general and efficient translations for the use 
of (garbage collected) heap memory, lists, higher 
order functions, and polymorphism. 

4.1 Memory management and functions 

PUC functions can be translated to Ada functions 
or procedures. If procedures are used, there are 
choices in the parameter modes used (i.e., IN, OUT, 
IN OUT). It is not always possible to select any of 
the choices, though, because they depend on the 
way data is manipulated in the calling functions. 

This brings up the issue of how memory is 
managed. The semantics of functional languages with 
automatic allocation and deallocation of memory is 
quite different from that of Ada. In Ada only local 
variables are allocated and deallocated 
automatically, because of the activation stack model used, 
whereas in functional languages all memory is 
allocated and deallocated automatically. An immediate 
consequence is that we will try to allocate as much 
memory as possible in the form of local variables, 
avoiding the use of the Ada heap. To do that we 
have to recognize when data can be stored safely in 
the stack (i.e., when we can be sure that data will 
not outlive the function call where the value was 
declared). One of the problems of this approach is 
that it complicates sharing. 

Another important issue in the management of 
memory is when to use variables. In functional 
languages there are no updatable variables and that 
means that every value computed needs newly 
allocated memory. We want to take advantage of Ada 
variables to avoid these allocations, even if they 
occur in the activation stack. For example, tail 
recursion can be translated into loops that will use 
variables for the information that is passed to the 
next activation (i.e., iteration). 

4.2 Translating lists 

There are several ways to translate lists into Ada, 
based on arrays or linked lists. When lists have a 
fixed known length, they can be translated into Ada 
arrays. If the length is not fixed but there is a 
reasonable upper bound, lists can be represented by a 
record with an array and a count of used elements. 
When the length of the lists is highly variable or not 
bounded then a linked list representation is used, 
using a predefined generic package. In the case of 
strings, it is desirable to use Ada strings, so that 
string variables are compatible with string literals. 
Array representations have advantages over linked 
lists because they can be more efficient and the 
generated Ada code closely resembles the PUC code. 

It is very difficult for the translator to detect 
whether a list can be represented by an array or 
not. On the other hand, if an array representation 
is chosen some upper bound has to be provided, so 
this translation cannot be done automatically. One 
solution to this problem is to provide a default 
representation with linked lists and let the programmer 
change that default. The default representation for 
strings is Ada strings. For each type that requires 
a non-default representation, the programmer has 
to specify which translation is desired. This 
translation applies to all objects of the type. 
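
The bounded representation described above (a record holding an array and a count of used elements) can be sketched as follows. This is an illustrative Python sketch only; the class name `BoundedList`, its method names, and the choice to raise an error on overflow are assumptions, not part of the translator described in the text.

```python
class BoundedList:
    """List stored in a fixed-size array plus a count of used elements."""

    def __init__(self, capacity):
        # Fixed storage, playing the role of the Ada array in the record.
        self.items = [None] * capacity
        # Number of array slots currently in use.
        self.count = 0

    def append(self, value):
        # The upper bound must be supplied by the programmer; exceeding
        # it is an error in this representation.
        if self.count == len(self.items):
            raise OverflowError("list exceeds its declared upper bound")
        self.items[self.count] = value
        self.count += 1

    def get(self, i):
        # 1-based indexing, matching the L(1) notation used in PUC.
        if not 1 <= i <= self.count:
            raise IndexError(i)
        return self.items[i - 1]

    def length(self):
        return self.count
```

The sketch makes the trade-off visible: indexing is direct, but the bound is a decision the translator cannot make on its own.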

4.3 Translating higher order functions 

Higher order functions are used often in 
specifications, because they are useful in representing 
abstract operations. Ada generics can represent uses 
of higher order functions in the particular case of 
functions passed as parameters, provided that all 
function parameters are statically known. This 
translation requires defining the function as generic 
and providing the corresponding instantiations. 

It is hard to make a general translator for higher 
order functions. However, most programs use 
higher order functions that either satisfy the 
restrictions in the above paragraph, or belong to a 
standard predefined set (e.g., the examples in 
Figure 1). For the first case, the translation scheme 
described suffices. For the second case it is 
possible to have a set of ad-hoc translation rules for the 
standard higher order functions. These rules are 
semantic-preserving transformations coded into the 
translator [15], hence programs written in terms of 
standard higher order functions can be 
automatically translated. The system can be extended by 
adding translation rules for domain-specific higher 
order functions. 

Example Function poly evaluates a polynomial 
represented by the list $[a_0, a_1, \ldots, a_n]$ of its 
coefficients, using the factorization 

$P(x) = a_0 + x(a_1 + x(a_2 + \cdots + x a_n \cdots))$. 

function poly (as: list of real; x: real) is 
  function combine (a_i, accum: real) is 
    (a_i + x * accum); 
begin 
  fold_r_1(combine, as) 
end; 

The use of function fold_r_1 (defined in Figure 1) 
can be transformed into a loop using the rule for 
fold_r_1. From the definition of fold_r_1, if as 
has only one element, then the result is equal to 
this element. If as has more than one element (say 
as = [first] & rest) then the result is equal to 

combine(first, fold_r_1(combine, rest)) 

That is, we can first compute the folding of the 
rest and then combine the result with the first 
element. This can be achieved by a loop that 
examines the elements in reverse order and accumulates 
the results of the folding. The first time the list 
will have only one element that is used to initialize 
the accumulator. Since we know that the loop will 
iterate length(as)-1 times we can use a for-loop. 

result := as(length(as)); 
for i in reverse 1 .. length(as)-1 loop 
  result := combine(as(i), result); 
end loop; 

To write the above loop in Ada we need to provide 
an implementation for lists and perform the 
corresponding translation on them. Note that this loop 
will be inefficient with linked implementations for 
lists, because fold_r_1 accesses the elements in 
reverse order. Now we can expand the call to combine 
and produce a complete Ada function. 
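
The loop rule can be checked against the recursive characterization with a small executable sketch. The Python below is illustrative only; `fold_r_1` follows the two cases stated in the text (a single element, or first combined with the folding of the rest), and the names `fold_r_1_loop`, `poly`, and `poly_loop` are assumptions introduced for the comparison.

```python
def fold_r_1(f, xs):
    """Right folding of a nonempty list, per the two cases in the text."""
    if len(xs) == 1:
        return xs[0]
    return f(xs[0], fold_r_1(f, xs[1:]))


def fold_r_1_loop(f, xs):
    """Loop form: start from the last element and walk the list in reverse."""
    result = xs[-1]
    # Visits the indices of all elements but the last, from right to left,
    # so the loop body runs len(xs) - 1 times.
    for i in reversed(range(len(xs) - 1)):
        result = f(xs[i], result)
    return result


def poly(coeffs, x):
    """Polynomial evaluation written with the recursive fold."""
    return fold_r_1(lambda a_i, accum: a_i + x * accum, coeffs)


def poly_loop(coeffs, x):
    """The same evaluation after the fold has been turned into a loop."""
    return fold_r_1_loop(lambda a_i, accum: a_i + x * accum, coeffs)
```

For the coefficients [1.0, 2.0, 3.0] and x = 2.0 both versions compute 1 + 2(2 + 2(3)) = 17.0.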

4.4 Translating polymorphism 

Some polymorphic functions can be translated into 
generic functions with type variables. This is not 
true of all polymorphic functions, because 
parametric polymorphism is a type system more powerful 
than generic types. The restriction is that all uses of 
a polymorphic function must be monomorphic (i.e., 
it should be possible to assign a static type to every 
use of a polymorphic function). That means that a 
polymorphic function cannot call another function 
using polymorphic parameters. This restriction is 
in principle rather severe, but does not apply to 
predefined operators and functions whose invocations 
are translated by ad-hoc rules. 

The difficulty with this approach is that all 
functions on the polymorphic type have to be explicitly 
declared. For example, if we have a function to 
operate on lists, all list primitives used have to be 
declared, and the function can be generic on both the 
base type and the list type. Figure 3 is the 
translation to Ada of function find from Section 3.4. 

generic 
  type a is private; 
  type list_of_a is private; 
  with function empty_list return list_of_a; 
  with function first (l: list_of_a) return a; 
  with function rest (l: list_of_a) return list_of_a; 
function find (value: a; L: list_of_a) return list_of_a is 
  result: list_of_a; 
begin 
  result := L; 
  while not ((result = empty_list) or else (first(result) = value)) loop 
    result := rest(result); 
  end loop; 
  return result; 
end find; 


Figure 3: Find the longest sublist containing value. 
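
The requirement that every list primitive be declared explicitly can be imitated in a language without generics by passing the primitives as arguments. The following Python sketch is only an analogy to the generic formal parameters of Figure 3; the factory `make_find` and the tuple-based instantiation `find_in_tuple` are assumptions introduced for illustration.

```python
def make_find(empty_list, first, rest):
    """Build find from explicitly supplied list primitives.

    The three arguments play the role of the generic formal
    subprograms (empty_list, first, rest) in the Ada version.
    """
    def find(value, lst):
        result = lst
        # Python's 'or' short-circuits like Ada's 'or else', so first()
        # is never applied to the empty list.
        while not (result == empty_list or first(result) == value):
            result = rest(result)
        return result
    return find


# One "instantiation": lists represented as Python tuples.
find_in_tuple = make_find((), lambda l: l[0], lambda l: l[1:])
```

For example, `find_in_tuple(3, (1, 2, 3, 4))` yields the sublist starting at the first 3, and a value that does not occur yields the empty tuple.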

Polymorphic records can be translated into sev- 
eral record declarations, one for each instantiation. 
As with functions, all uses of polymorphic records 
have to be monomorphic, or else the translation 
cannot be done automatically. 

Example The polymorphic function fold_l 
folds a list into one value by combining values 
pairwise from the left of the list. Since fold_l is also 
higher order, the techniques discussed above apply 
as well. 

function fold_l (f: function(?a,?b) return ?b; 
                 accum: ?b; L: list of ?a) return b is 
begin 
  if L = [] then accum 
  else fold_l(f, f(L(1), accum), rest(L)) end 
end; 

Fold_l can be transformed into a while-loop (it is 
tail recursive). Consider the following call to fold_l: 

result := fold_l(f, value, a_list); 

From the definition of fold_l, if a_list is the 
empty list [], then result is equal to value. 
If a_list is not empty (say a_list = [first] & 
rest) then the result is equal to 

fold_l(f, f(first, value), rest) 

so that this is a call to the same function, in which 
both value and a_list are updated accordingly. 
Hence the following while-loop in pseudo-Ada is a 
valid translation: 

result := value; 
aux_list := a_list; 
while aux_list /= [] loop 
  result := f(aux_list(1), result); 
  aux_list := rest(aux_list); 
end loop; 

An obvious efficiency improvement is to use an 
index variable, updating this variable instead of 
copying a list. Furthermore, since the loop will iterate 
length(a_list) times we can use a for-loop. 

result := value; 
for j in 1 .. length(a_list) loop 
  result := f(a_list(j), result); 
end loop; 

To write the above loop in Ada we need to provide 
an implementation for lists and perform the 
corresponding translation on them. Unlike the 
translation for fold_r_1, this loop is efficient with linked 
implementations for lists because the elements are 
accessed in order. 
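
The three stages of this derivation (tail recursion, while-loop, for-loop) can be written side by side and compared. The Python below is a minimal sketch under the assumption that Python lists stand in for PUC lists; the names `fold_l_while` and `fold_l_for` are introduced here only for the comparison.

```python
def fold_l(f, accum, xs):
    """Tail-recursive left folding, following the PUC definition."""
    if not xs:
        return accum
    return fold_l(f, f(xs[0], accum), xs[1:])


def fold_l_while(f, value, a_list):
    """While-loop translation: both the accumulator and the list are updated."""
    result = value
    aux_list = a_list
    while aux_list != []:
        result = f(aux_list[0], result)
        aux_list = aux_list[1:]      # copies the remainder of the list
    return result


def fold_l_for(f, value, a_list):
    """For-loop translation: an index replaces the copied list."""
    result = value
    for j in range(len(a_list)):
        result = f(a_list[j], result)
    return result
```

With a function that is not commutative, such as one that prepends its first argument to the accumulated list, all three versions return the input list reversed, which confirms that the element order is preserved by the translation.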

4.5 Example: A simplified telemetry processor 

A telemetry processor is a program that interprets 
telemetry data sent from a spacecraft. Satellite 
telemetry data is a sequence of samples, each 
containing a set of measures representing the status of 
the spacecraft systems [4]. Data is transmitted to a 
ground station in binary form, packed in fixed-size 
bit matrices called master frames. 

The telemetry processor takes this coded data 
and produces calibrated data in engineering units 
(e.g., meters, Watts) in floating point format. The 
calibration is done by extracting each measure from 
the master frame and evaluating a polynomial on its 
value. In addition, some measures require maximum 
and minimum limit checks. The input to a 
telemetry processor is a master frame and a set of 
descriptions of measures. The output is a set of calibrated 
measures. These sets will be represented by lists. 

The following PUC type declaration represents a 
master frame as a list of lists of bits. 

type bit is (On, Off); 
type row is list of bit; 
type master_frame is list of row; 

Each row in the master frame is a fixed-length 
bitstring considered to be divided into several 
bitstrings of various lengths representing measures. 
Measures are described by the following attributes: 
name, position in the master frame, and calibration 
parameters. The position in the master frame 
includes the row number and the first and last bit 
positions within the row. Calibration parameters for 
each measure are: coefficients for the polynomial, a 
check-range flag, and minimum and maximum 
values (used if the flag is true). 

type measure_description is 
  record 
    name      : string; 
    row_num   : integer; 
    first_bit : integer; 
    last_bit  : integer; 
    coeffs    : list of real; 
    do_check  : boolean; 
    min_value : real; 
    max_value : real; 
  end; 

Calibrated measures are described by the name 
of the measure, the result of the polynomial 
evaluation, and a range check code that is either Small, 
In_range, Large, or No_check, depending on the 
range check of the value. 

type range_code is (Small, In_range, 
                    Large, No_check); 
type calibrated_measure is 
  record 
    name  : string; 
    value : real; 
    range : range_code; 
  end; 

The main function of the specification is 
calibrate_master, which returns a list of calibrated 
measures given the master frame and a list of 
measure descriptions. It is defined by applying 
function calibrate_measure to each measure 
description (Figure 4). Function calibrate_measure uses 
function extract to obtain the bitstring of the 
measure, function to_number to convert from binary 
to floating point, and function poly_eval to 
evaluate the corresponding polynomial. The range check 
code is computed with a nested if expression. 

To generate an Ada program we need to provide 
translations for functions like apply, curry, etc. 
We also have to decide how each list will be 
implemented. Figure 5 is the main program in Ada. 
The list of measure descriptions is represented by 
an array because the number of measure 
descriptions is fixed for each satellite. The apply function 
is translated into a for-loop because the size of the 
list is a constant. The curry function is not 
explicitly translated: it is only a notation to provide the 
additional parameter within the loop. An explicit 
list of calibrated measures is built in local variable 
result, which is the returned value. 
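
The specification and its loop translation can be tried out together in a small executable sketch. The Python below is illustrative only. The text does not define extract, to_number, or poly_eval, so the versions here rest on stated assumptions: rows and bit positions are 1-based with inclusive bounds, bitstrings are unsigned with the most significant bit first, and bits are written as 0 and 1 instead of Off and On.

```python
from dataclasses import dataclass
from functools import partial
from typing import List


@dataclass
class MeasureDescription:
    name: str
    row_num: int
    first_bit: int
    last_bit: int
    coeffs: List[float]
    do_check: bool
    min_value: float
    max_value: float


@dataclass
class CalibratedMeasure:
    name: str
    value: float
    range: str


def extract(master, m):
    # Assumption: 1-based row and bit positions, inclusive bounds.
    return master[m.row_num - 1][m.first_bit - 1:m.last_bit]


def to_number(bits):
    # Assumption: unsigned value, most significant bit first.
    n = 0
    for b in bits:
        n = 2 * n + b
    return float(n)


def poly_eval(coeffs, x):
    # Nested evaluation a0 + x(a1 + x(a2 + ...)), from the last coefficient.
    result = 0.0
    for a in reversed(coeffs):
        result = a + x * result
    return result


def calibrate_measure(master, m):
    value = poly_eval(m.coeffs, to_number(extract(master, m)))
    if not m.do_check:
        code = "No_check"
    elif value < m.min_value:
        code = "Small"
    elif value > m.max_value:
        code = "Large"
    else:
        code = "In_range"
    return CalibratedMeasure(m.name, value, code)


def calibrate_master(master, measures):
    # Specification style: apply(curry(calibrate_measure, master), measures).
    return list(map(partial(calibrate_measure, master), measures))


def calibrate_master_loop(master, measures):
    # Translated style: a for-loop filling a result array of known size.
    result = [None] * len(measures)
    for j in range(len(measures)):
        result[j] = calibrate_measure(master, measures[j])
    return result
```

In the loop version the curried argument simply appears as the extra parameter `master` in the call, which is the sense in which curry needs no explicit translation.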

5 Conclusions 

We have presented a specification language 
suitable for a specific class of software engineering 
environments using Ada. The main purpose of 
this language is to bridge the gap between very 
high level specifications and detailed algorithms and 
data structures, so we have attempted to define 
constructs that are similar to those in Ada, especially 
in data structures. On the other hand, the need to 
represent application level concepts has led us to 
include features like higher order functions and 
polymorphism, to increase the reusability of the 
specifications. 

We had to make several trade-offs in the design 
of PUC, because we wanted expressiveness, 
simplicity and similarity to Ada. We decided not to 
include algebraic data types and pattern matching 
(present in several functional languages); the more 
familiar concepts of variant record and case 
expression were used instead. We included parametric 
polymorphism, higher order functions, and Currying 
(i.e., partial parametrization of functions), but since 
these are advanced features, we wanted them to 
be explicit. Having these constructs complicated 
the process of translation from PUC to Ada, but 
they provided the abstraction mechanisms needed 
in a specification language. Hence, we studied 
semi-automatic methods of translation. 

function calibrate_master (master: master_frame; 
                           measures: list of measure_description) 
    return list of calibrated_measure is 
begin 
  apply(curry(calibrate_measure, master), measures) 
end; 

function calibrate_measure (master: master_frame; measure: measure_description) 
    return calibrated_measure is 
  bits := extract(master, measure); 
  value := poly_eval(measure.coeffs, to_number(bits)); 
  code := if not (measure.do_check) then No_check 
          elsif value < measure.min_value then Small 
          elsif value > measure.max_value then Large 
          else In_range end; 
begin 
  calibrated_measure'(measure.name, value, code) 
end calibrate_measure; 


Figure 4: Calibration functions of telemetry processor. 


function calibrate_master (master:   in master_frame; 
                           measures: in list_of_measure_description) 
    return list_of_calibrated_measure is 
  result: list_of_calibrated_measure; 
begin 
  for j in measures'range loop 
    result(j) := calibrate_measure(master, measures(j)); 
  end loop; 
  return result; 
end; 


Figure 5: Main function in Ada. 

A specification language is not useful unless there 
is a software development method that will include 
its use. We have presented two non-exclusive 
methods: use the specifications as a prototype and 
transform the specification into an Ada program. Both 
approaches require the development of supporting 
tools. The language, along with its related 
methods and tools, will provide for a practical 
semi-formal software engineering environment. However, 
we have not tested extensively the use of functional 
languages in the specification of large scientific 
software in Ada. 

Abstract 

This paper describes part of a multi-year study of 
software reuse being performed at the University of Maryland. 
The part of the study which is reported here explores 
techniques for the transformation of Ada programs which 
preserve function but which result in program components 
that are more independent, and presumably therefore, more 
reusable. Goals for the larger study include a precise 
specification of the transformation technique and its 
application in a large development organization. Expected 
results of the larger study, which are partially covered here, 
are the identification of reuse promoters and inhibitors both 
in the problem space and in the solution space, the 
development of a set of metrics which can be applied to both 
developing and completed software to reveal the degree of 
reusability which can be expected of that software, and the 
development of guidelines for both developers and reviewers of 
software which can help assure that the developed software 
will be as reusable as desired. 

The advantages of transforming existing software into 
reusable components, rather than creating reusable 
components as an independent activity, include: 1) software 
development organizations often have an archive of previous 
projects which can yield reusable components, 2) developers 
of ongoing projects do not need to adjust to new and possibly 
unproven methods in an attempt to develop reusable 
components, so no risk or development overhead is introduced, 
3) transformation work can be accomplished in parallel with 
line developments but be separately funded (this is 
particularly applicable when software is being developed for 
an outside customer who may not be willing to sustain the 
additional costs and risks of developing reusable code), 4) the 
resulting components are guaranteed to be relevant to the 
application area, and 5) the cost is low and controllable. 


Introduction 

Broadly defined, software reuse includes more than the 
repeated use of particular code modules. Other life cycle 
products such as specifications or test plans can be reused, 
software development processes such as verification 
techniques or cost modeling methods are reusable, and even 
intangible products such as ideas and experience contribute to 
the total picture of reuse [1,2]. Although process and tool 
reuse is common practice, life cycle product reuse is still in 
its infancy. Ultimately, reuse of early life cycle products 
might provide the largest payoff. For the near term, however, 
gains can be realized and further work can be guided by 
understanding how software can be developed with a minimum 
of newly-generated source lines of code. 

The work covered in this paper includes a feasibility study 
and some examples of generalizing, by transforming, software 
source code after it has been initially developed, in order to 
improve its reusability. The term software reclamation has 
been chosen for this activity since it does not amount to the 
development of new software but rather to the distillation of 
existing software. (Reclamation is defined in the dictionary as 
obtaining something from used products or restoring 
something to usefulness [3].) By exploring the ability to 
modify and generalize existing software, characterizations of 
that software can be expressed which relate to its reusability, 
which in turn is related to its maintainability and portability. 
This study includes applying these generalizations to several 
small example programs, to medium-sized programs from 
different organizations, and to several fairly large programs 
from a single organization. 

Earlier work has examined the principle of software 
reclamation through generic extraction with small examples. 
This has revealed the various levels of difficulty which are 
associated with generalizing various kinds of Ada dependencies. 
For example, it is easier to generalize a dependency that exists 
on encapsulated data than on visible data, and it is easier to 
generalize a dependency on a visible array type than on a 
visible record type. Following that work, some medium-sized 
examples of existing software were analyzed for potential 
generalization. The limited success of these efforts revealed 
additional guidelines for development as well as limitations of 
the technique. Summaries of this preceding work appear in 
the following sections. 

The data for the current research is Ada software from 
the NASA Goddard Space Flight Center which was written over 
the past three years to perform spacecraft simulations. Three 
programs, each on the order of 100,000 (editor) lines, were 
studied. Software code reuse at NASA/GSFC has been practiced 
for many years, originally with Fortran developments, and 
more recently with Ada. Since transitioning to Ada, 
management has observed a steadily increasing amount of 
software reuse. One goal which is introduced here but which 
will be addressed in more detail in the larger study is to 
understand the nature of the reuse being practiced there 
and to examine the reasons for the improvement seen with Ada. 
Another goal of this as well as the larger study is to compare 
the guidelines derived from the examination of how different 
programs yield to or resist generalization. Several questions 


8th Annual National Conference on Ada Technology 1990 





are considered through this comparison, including the 
universality of guidelines derived from a single program and 
whether the effect of the application domain, or problem 
space, on software reusability can be distinguished from the 
effect of the implementation, or solution space. 

Superficially, therefore, this paper describes a technique 
for generalizing existing Ada software through the use of the 
generic feature. However, the success and practicality of this 
technique is greatly affected by the style of the software being 
transformed. The examination of what characterizations of 
software are correlated with transformability has led to the 
derivation of software development and review guidelines. It 
appears that most, if not all, of the guidelines suggested by 
this examination are consistent with good programming 
practices as suggested by other studies. 


The Basic Technique 

By studying the dependencies among software elements at 
the code level, a determination can be made of the reusability 
of those elements in other contexts. For example, if a 
component of a program uses or depends upon another 
component, then it would not normally be reusable in another 
program where that other component was not also present. On 
the other hand, a component of a software program which does 
not depend on any other software can be used, in theory at 
least, in any arbitrary context. This study concentrates only 
on the theoretical reusability of a component of software, 
which is defined here as the amount of dependence that exists 
between that component and other software components. Thus, 
it is concerned only with the syntax of reusable software. It 
does not directly address issues of practical reusability, such 
as whether a reusable component is useful enough to encourage 
other developers to reuse it instead of redeveloping its 
function. The goal of the process is to identify and extract the 
essential functionality from a program so that this extracted 
essence is not dependent on external declarations, information, 
or other knowledge. Transformations are needed to derive 
such components from existing software systems since 
inter-component dependencies arise naturally from the 
customary design decomposition and implementation processes 
used for software development. 

Ideal examples of reusable software code components can 
be defined as those which have no dependencies on other 
software. Short of complete independence, any dependencies 
which do exist provide a way of quantifying the reusability of 
the components. In other words, the reusability of a 
component can be thought of as inversely proportional to the 
amount of external dependence required by that component. 
However, some or all of that dependence may be removable 
through transformation by generalizing the component. A 
measure of a component's dependence on its externals which 
quantifies the difficulty of removing that dependence through 
transformation and generalization is slightly different from 
simply measuring the dependence directly, and is more 
specifically appropriate to this study. The amount of such 
transformation constitutes a useful indication of the effort to 
reuse a body of software. 

Both the transformation effort and the degree of success 
with performing the transforms can vary from one example to 
the next. The identification of guidelines for developers and 
reviewers was made possible by observing what promoted or 
impeded the transformations. These guidelines can also help in 
the selection of reusable or transformable parts from existing 
software. Since dependencies among software components can 
typically be determined from the software design, many of the 
guidelines apply to the design phase of the life cycle, allowing 
earlier analysis of reusability and enabling possible 
corrective action to be taken before a design is implemented. 
Although the guidelines are written with respect to the 
development and reuse of systems written in the Ada language, 
since Ada is the medium for this study, most apply in general 
to software development in any language. 

One measure of the extent of the transformation required 
is the number of lines of code that need to be added, altered, or 
deleted [4]. However, some modifications require new 
constructs to be added to the software while others merely 
require syntactic adjustments that could be performed 
automatically. For this reason, a more accurate measure 
weighs the changes by their difficulty. A component can 
contain dependencies on externals that are so intractable that 
removing them would mean also removing all of the useful 
functionality of the component. Such transformations are not 
cost-effective. In these cases, either the component in 
question must be reused in conjunction with one or more of the 
components on which it depends, or it cannot be generalized 
into an independently reusable one. Therefore, for any given 
component, there is a possibility that it contains some 
dependencies on externals which can be eliminated through 
transformation and also a possibility that it contains some 
dependencies which cannot be eliminated. 
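
The difference between a raw count of changed lines and a difficulty-weighted measure can be sketched as follows. This Python fragment is purely illustrative: the change categories and the weight values are hypothetical and are not taken from the study or from [4].

```python
def raw_change_count(changes):
    """Unweighted measure: total lines added, altered, or deleted."""
    return sum(changes.values())


def weighted_effort(changes, weights):
    """Weighted measure: each kind of change is scaled by its difficulty."""
    return sum(weights[kind] * count for kind, count in changes.items())


# Hypothetical example: a transformation needing 3 lines of new constructs
# and 10 lines of syntactic adjustment that could be automated.
example_changes = {"new_construct": 3, "syntactic_adjustment": 10}

# Hypothetical weights: new constructs are assumed much harder than
# mechanical syntactic adjustments.
example_weights = {"new_construct": 5.0, "syntactic_adjustment": 0.5}
```

Under these assumed weights the 3 lines of new constructs dominate the weighted measure even though they are a minority of the changed lines.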

To guide the transformations, a model is used which 
distinguishes between software function and the declarations 
on which that function is performed. In an object-oriented 
program (for here, a program which uses data abstraction), 
data declarations and associated functionality are grouped into 
the same component. This component itself becomes the 
declaration of another object. This means the function / 
declaration distinction can be thought of as occurring on 
multiple levels. The internal data declarations of an object can 
be distinguished from the construction and access operations 
supplied to external users of the object, and the object as a 
whole can be distinguished from its external use which applies 
additional function (possibly establishing yet another, higher 
level object). The distinction between functions and objects is 
more obvious where a program is not object-oriented since 
declarations are not grouped with their associated 
functionality, but rather are established globally within the 
program. 

At each level, declarations are seen as application-specific 
while the functions performed on them are seen as the 
potentially generalizable and reusable parts of a program. 
This may appear backwards initially, since data abstractions 
composed of both declarations and functions are often seen as
reusable components. However, for consistency here,
functions and declarations within a data abstraction are viewed
as separable in the same way as functions which depend on
declarations contained in external components are separable
from those declarations. In use, the reusable, independent
functional components are composed with application-specific
declarations to form objects, which can further be composed
with other independent functional components to implement an
even larger portion of the overall program.

Figure 1 shows one way of representing this. All the ovals 
are objects. The dark ones are primitives which have 
predefined operations, such as integer or Boolean. The white 
ovals represent program-supplied functionality which is
composed with their contained objects to form a higher level
object. The intent of the model is to distinguish this
program-specific functionality and to attempt to represent it
independently of the objects upon which it acts.



Figure 1. 

Some Ada which might be represented as in the above 
figure might be: 


package Counter is                       -- resulting object
   procedure Reset;                      -- applicable function ...
   procedure Increment;
   function Current_Value return Natural;
end Counter;

package body Counter is
   Count : Natural := 0;                 -- simple object

   procedure Reset is
   begin
      Count := 0;
   end Reset;

   procedure Increment is
   begin
      Count := Count + 1;
   end Increment;

   function Current_Value return Natural is
   begin
      return Count;
   end Current_Value;
end Counter;


package Max_Count is                     -- resulting object
   procedure Reset;                      -- applicable function ...
   procedure Increment;
   function Current_Value return Natural;
   function Max return Natural;
end Max_Count;

with Counter;
package body Max_Count is
   Max_Val : Natural := 0;               -- additional object

   procedure Reset is
   begin
      Counter.Reset;
   end Reset;

   procedure Increment is
   begin
      Counter.Increment;
      if Max_Val < Counter.Current_Value then
         Max_Val := Counter.Current_Value;
      end if;
   end Increment;

   function Current_Value return Natural is
   begin
      return Counter.Current_Value;
   end Current_Value;

   function Max return Natural is
   begin
      return Max_Val;
   end Max;
end Max_Count;


In this example, the objects are properly encapsulated,
though they might not have been. If, for example, the simple
objects were declared in separate components from their
applicable functions, the result could have been the same
(although the diagram might look different). In actual
practice, Ada programs are developed with a combination of 
encapsulated object-operation groups as well as separately 
declared object-operation groups. Often the lowest levels are 
encapsulated while the higher level and larger objects tend to 
be separate from their applicable function. Perhaps in the 
ideal case, all objects would be encapsulated with their applied 
function since encapsulation usually makes the process of 
extracting the functionality at a later time easier. This, 
therefore, becomes one of the guidelines revealed by this 
model. 

If the above example were transformed to separate the
functionality from each object, the following set of components 
might be derived: 

generic
   type Count_Object is (<>);
package Gen_Counter is                   -- resulting object
   procedure Reset;                      -- applicable function ...
   procedure Increment;
   function Current_Value return Count_Object;
end Gen_Counter;

package body Gen_Counter is
   Count : Count_Object                  -- simple object
         := Count_Object'First;

   procedure Reset is
   begin
      Count := Count_Object'First;
   end Reset;

   procedure Increment is
   begin
      Count := Count_Object'Succ (Count);
   end Increment;

   function Current_Value return Count_Object is
   begin
      return Count;
   end Current_Value;
end Gen_Counter;

generic
   type Count_Object is (<>);
package Gen_Max_Count is                 -- resulting object
   procedure Reset;                      -- applicable function ...
   procedure Increment;
   function Current_Value return Count_Object;
   function Max return Count_Object;
end Gen_Max_Count;

with Gen_Counter;
package body Gen_Max_Count is
   Max_Val : Count_Object                -- additional object
           := Count_Object'First;
   package Counter is
      new Gen_Counter (Count_Object);

   procedure Reset is
   begin
      Counter.Reset;
   end Reset;

   procedure Increment is
   begin
      Counter.Increment;
      if Max_Val < Counter.Current_Value then
         Max_Val := Counter.Current_Value;
      end if;
   end Increment;

   -- The result types match the generic specification above.
   function Current_Value return Count_Object is
   begin
      return Counter.Current_Value;
   end Current_Value;

   function Max return Count_Object is
   begin
      return Max_Val;
   end Max;
end Gen_Max_Count;

with Gen_Max_Count;
procedure Max_Count_User is
   package Max_Count is
      new Gen_Max_Count (Natural);
begin
   Max_Count.Reset;
   Max_Count.Increment;
end Max_Count_User;

Note that the end user obtains the same functionality that a
user of Max_Count has, but the software now allows the
primitive object Natural to be supplied externally to the
algorithms that will apply to it. Further, the user could have
obtained analogous functionality for any discrete type simply 
by pairing the general object with a different type (using a 
different generic instantiation). 
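
As a small illustration of pairing the general object with a different discrete type, the sketch below instantiates a counter like Gen_Counter with an enumeration type. The type Color and the procedure name Color_Count_Demo are hypothetical and not part of the original study, and the generic is nested inside the procedure only to keep the sketch in one compilation unit. It assumes an Ada 95 or later compiler.

```ada
with Ada.Text_IO;

procedure Color_Count_Demo is

   --  Generic counter in the style shown above: any discrete type fits.
   generic
      type Count_Object is (<>);
   package Gen_Counter is
      procedure Reset;
      procedure Increment;
      function Current_Value return Count_Object;
   end Gen_Counter;

   package body Gen_Counter is
      Count : Count_Object := Count_Object'First;

      procedure Reset is
      begin
         Count := Count_Object'First;
      end Reset;

      procedure Increment is
      begin
         Count := Count_Object'Succ (Count);
      end Increment;

      function Current_Value return Count_Object is
      begin
         return Count;
      end Current_Value;
   end Gen_Counter;

   --  Hypothetical application-specific type, supplied by the user.
   type Color is (Red, Green, Blue);

   package Color_Counter is new Gen_Counter (Color);

begin
   Color_Counter.Reset;
   Color_Counter.Increment;
   --  After one increment from Red the counter holds Green.
   Ada.Text_IO.Put_Line (Color'Image (Color_Counter.Current_Value));
end Color_Count_Demo;
```

Incrementing past the last value of the actual type (here Blue) raises Constraint_Error, so the range of the supplied type becomes part of the contract between the generic and its user.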

This model is somewhat analogous to the one used in 
Smalltalk programming where objects are assembled from
other objects plus programmer-supplied specifics. However,
it is meant to apply more generally to Ada and other languages
that do not have support for dynamic binding and full
inheritance, features that are in general unavailable when
strong static type checking is required. Instead, Ada offers the
generic feature which can be used as shown here to partially
offset the constraints imposed by static checking.

Applying this model to existing software means that any
lines of code which represent reusable functionality must be
parameterized with generic formal parameters in order to
make them independent from their surrounding declaration
space (if they are not already independent). Generics that are
extracted by generalizing existing program units, through the 
removal of their dependence on external declarations, can then 
be offered as independently reusable components for other 
applications. 

Unfortunately, declarative dependence is only one of the 
ways that a program unit can depend on its external 
environment. Removing the compiler-detectable declarative
dependencies by producing a generic unit is no guarantee that
the new unit will actually be independent. There can be
dependencies on data values that are related to values in
neighboring software, or even dependencies on protocols of
operation that are followed at the point where a resource was
originally used but which could be violated at a point of later
reuse. (An example of this kind of dependency is described in
the Measurement section.) To be complete, the transformation
process would need to identify and remove these other types of
dependence as well as the declarative dependence. Although
guidelines have been identified by this study which can reduce
the possibility for these other types of dependencies to enter a 
system, this work only concentrates on mechanisms to 
measure and remove declarative dependence. 


More Examples

In a language with strong static type checking, such as Ada,
any information exchanged between communicating program 
units must be of some type which is available to both units. 
Since Ada enforces name equivalence of types, where a type 
name and not just the underlying structure of a type 
introduces a new and distinct type, the declaration of the type 
used to pass information between units must be visible to both 
of those units. The user of a resource, therefore, is 
constrained to be in the scope of all type declarations used in
the interface of that resource. In a language with a fixed set of 
types this is not a problem since all possible types will be 
globally available to both the resource and its users. 
However, in a language which allows user-declared types and 
enforces strong static type checking of those types, any 
inter-component communication with such types must be 
performed in the scope of those programmer-defined 
declarations. This means that the coupling between two
communicating components increases from data coupling to 
external coupling (or from level two to level five on the 
traditional seven-point scale of Myers, where level one is the 
lowest level of coupling) [5]. 

Consider, for example, project-specific type declarations 
which often appear at low, commonly visible levels in a 
system. Resources which build upon those declarations can 
then be used in turn by higher level application-specific
components. If a programmer attempts to reuse those
intermediate-level resources in a new context, it is necessary
to also reuse the low-level declarations on which they are
built. This may not be acceptable, since combining several 
resources from different original contexts means that the set 
of low-level type declarations needed can be extensive and not 
generally compatible. This situation can occur whether or not 
data is encapsulated with its applicable function, but for 
clarity, and to contrast with the previous examples, it is 
shown here with the data and its operations declared 
separately. 

For example, imagine that two existing programs each 
contain one of the following pairs of compilation units: 

-- First program contains first pair:

package Vs_1 is
   type Variable_String is
      record
         Data   : String (1 .. 80);
         Length : Natural;
      end record;
   function Variable_String_From_User
      return Variable_String;
end Vs_1;

with Vs_1;
package Pm_1 is
   type Phone_Message is
      record
         From : Vs_1.Variable_String;
         To   : Vs_1.Variable_String;
         Data : Vs_1.Variable_String;
      end record;
   function Phone_Message_From_User
      return Phone_Message;
end Pm_1;

-- Second program contains second pair:

package Vs_2 is
   type Variable_String is
      record
         Data   : String (1 .. 250) := (others => ' ');
         Length : Natural := 0;
      end record;
   function Variable_String_From_User
      return Variable_String;
end Vs_2;

with Vs_2;
package Mm_2 is
   type Mail_Message is
      record
         From    : Vs_2.Variable_String;
         To      : Vs_2.Variable_String;
         Subject : Vs_2.Variable_String;
         Text    : Vs_2.Variable_String;
      end record;
   function Mail_Message_From_User
      return Mail_Message;
end Mm_2;

Now, consider the programmer who is trying to reuse the
above declarations in the same program. A reasonable way to
combine the use of Mail_Messages with the use of
Phone_Messages might seem to be as follows:

with Vs_1;
with Pm_1;
with Mm_2;
procedure User is
   Name : Vs_1.Variable_String;
   Pm   : Pm_1.Phone_Message :=
             Pm_1.Phone_Message_From_User;
   Mm   : Mm_2.Mail_Message :=
             Mm_2.Mail_Message_From_User;
begin
   Name    := Pm.To;
   Mm.From := Name;   -- illegal
end User;

This will fail to compile, however, since the types
Vs_1.Variable_String and Vs_2.Variable_String are distinct and
therefore values of one are not assignable to objects of the
other (the value of Name is of type Vs_1.Variable_String and
the record component Mm.From is of type
Vs_2.Variable_String).

In the above example, note that the variable string types
were left visible rather than made private to make it seem
even more plausible for a programmer to expect that, at least
logically, the assignment attempted is reasonable. However,
the incompatibility between the underlying type declarations
used by Mail_Message and Phone_Message becomes a problem.
One solution might be to use type conversion. However,
employing type conversion between elements of the low level
variable string types destroys the abstraction for the
higher-level units. For instance, the user procedure above
could be written as shown below, but exposing the detail of the
implementation of the variable strings represents a poor, and
possibly dangerous, programming style.

with Vs_1;
with Pm_1;
with Mm_2;
procedure Type_Conversion_User is
   Name : Vs_1.Variable_String;
   Pm   : Pm_1.Phone_Message :=
             Pm_1.Phone_Message_From_User;
   Mm   : Mm_2.Mail_Message :=
             Mm_2.Mail_Message_From_User;
begin
   Name := Pm.To;
   Mm.From.Data (1 .. 80) := Name.Data;
   Mm.From.Length := Name.Length;
end Type_Conversion_User;

Notice that we had to be careful to avoid a constraint error
at the point of the data assignment. This is one example of how
combining the use of resources which rely on different
context declarations is difficult in Ada.

Static type checking, therefore, is a mixed blessing. It
prevents many errors from entering a software system which
might not otherwise be detected until run time. However, it
limits the possible reuse of a module if a specific declaration
environment must also be reused. Not only must the reused
module be in the scope of those declarations, but so must its
users. Further, those users are forced to communicate with
that module using the shared external types rather than their
own, making the resource master over its users instead of the
other way around. The set of types which facilitates
communication among the components of a program, therefore,
ultimately prevents most, if not all, of the developed
algorithms from being easily used in any other program.

This study refers to declarations such as those of the above
variable string types as contexts, and to components which
build upon those declarations and which are in turn used by
other components, such as the above Mail_Message and
Phone_Message packages, as resources. Components which
depend on resources are referred to as users. The above
illustrates the general case of a context-resource-user
relationship. It is possible for a component to be both a
resource at one level and also a context for a still higher-level
resource. The dependencies among these three basic categories
of components can be illustrated with a directed graph. Figure 
2 shows a graph of the kind of dependency illustrated in the 
example above. 


A resource does not always need full type information
about the data it must access in order to accomplish its task.
In the above examples, it would be possible for the Mail and
Phone message resources to implement their functions via the
functions exported from the variable string packages without
any further information about the structures of those lower
level variable string types. Sometimes, even less knowledge
of the structure or functionality of the types being
manipulated by a resource is required by that resource for it
to accomplish its function.


(Dependency graph with a "user" node. Legend: an arrow from
A to B means that A depends on B.)

Figure 2.


A common example of a situation where a resource needs
no structural or operational information about the objects it
manipulates is a simple data base which stores and retrieves
data but which does not take advantage of the information
contained by that data. It is possible to write or transform
such a resource so that the context it requires (the type of
the object to be stored and retrieved) is supplied by the users
of that resource. Then, only the essential work of the module
needs to remain. This "essence only" principle is the key to
the transformations sought. Only the purpose of a module
remains, with any details needed to produce the executing code,
such as actual type declarations or specific operations on those
types, being provided later by the users of the resource. In
languages such as Smalltalk which allow dynamic binding, this
information is bound at run time. In Ada, where the compiler
is obligated to perform all type checking, generics are bound
at compilation time, eliminating a major source of run time
errors caused by attempting to perform inappropriate
operations on an object. Even though they are statically
checked, however, Ada generics can often allow a resource to
be written so as to free it from depending upon external type
definitions.

Using the following arbitrary type declaration and a 
simplified data store package, one possible transformation is 
illustrated. First the example is shown before any 
transformation is applied: 


-- context:

package Decls is
   type Typ is ...    -- anything but limited private
end Decls;

-- resource:

with Decls;
package Store is
   procedure Put (Obj : in Decls.Typ);
   procedure Get_Last (Obj : out Decls.Typ);
end Store;

package body Store is
   Local : Decls.Typ;

   procedure Put (Obj : in Decls.Typ) is
   begin
      Local := Obj;
   end Put;

   procedure Get_Last (Obj : out Decls.Typ) is
   begin
      Obj := Local;
   end Get_Last;
end Store;

The above resource can be transformed into the following 
one which has no dependencies on external declarations: 

-- generalized resource:

generic
   type Typ is private;
package General_Store is
   procedure Put (Obj : in Typ);
   procedure Get_Last (Obj : out Typ);
end General_Store;

package body General_Store is
   Local : Typ;

   procedure Put (Obj : in Typ) is
   begin
      Local := Obj;
   end Put;

   procedure Get_Last (Obj : out Typ) is
   begin
      Obj := Local;
   end Get_Last;
end General_Store;

Note that, by naming the generic formal parameter 
appropriately, none of the identifiers in the code needed to
change, and the expanded names were merely shortened to
their simple names. This minimizes the handling required to
perform the transformation (although automating the process
would make this an unimportant issue). This transformation
required the removal of the context clause, the addition of two 
lines (the generic part) and the shortening of the expanded 
names. The modification required to convert the package to a 
theoretically independent one constitutes a reusability 
measure. A user of the resource in the original form would 
need to add the following declaration in order to obtain an 
appropriate instance of the resource: 


package Store is new General_Store (Decls.Typ);

Formal rules for counting program changes have already 
been proposed and validated [4], and adaptations of these
counting rules (such as using a lower handling value for
shortening expanded names and a higher one for adding generic
formals) are being considered as part of this work.
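
To make the idea of a weighted change count concrete, the following sketch tallies the changes made when converting Store into General_Store. It is an illustration only: the change categories, the handling values, and the tally itself are assumptions chosen for this example, not the counting rules of [4] or values proposed by this study. It assumes an Ada 95 or later compiler.

```ada
with Ada.Text_IO;

procedure Weighted_Change_Demo is

   --  Hypothetical change categories, named after the edits described above.
   type Change_Kind is
     (Remove_Context_Clause, Add_Generic_Formal, Shorten_Expanded_Name);

   type Value_Table is array (Change_Kind) of Natural;

   --  Assumed handling values: adding a generic formal is weighted
   --  higher than shortening an expanded name.
   Weights : constant Value_Table :=
     (Remove_Context_Clause => 1,
      Add_Generic_Formal    => 3,
      Shorten_Expanded_Name => 1);

   --  Assumed tally for Store: one context clause removed, one generic
   --  formal added, five occurrences of Decls.Typ shortened to Typ.
   Counts : constant Value_Table :=
     (Remove_Context_Clause => 1,
      Add_Generic_Formal    => 1,
      Shorten_Expanded_Name => 5);

   Total : Natural := 0;

begin
   --  Sum each count multiplied by its handling value.
   for Kind in Change_Kind loop
      Total := Total + Weights (Kind) * Counts (Kind);
   end loop;
   Ada.Text_IO.Put_Line ("Weighted change count:" & Natural'Image (Total));
end Weighted_Change_Demo;
```

Under these assumed values the total is 9. An unweighted count of the same changes would be 7, so the weighting affects the result only through the single generic formal.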

The earlier example with the variable string types can 
also be generalized to remove the dependencies between the 
mail and phone message packages (resources) and the variable 
string packages (contexts). For example, ignoring the
implementations (bodies) of the resources, the following 
would functionally be equivalent to those examples: 
-- Contexts, as before:

package Vs_1 is
   type Variable_String is
      record
         Data : String (1 .. 80);
         Len  : Natural;
      end record;
   function Variable_String_From_User
      return Variable_String;
end Vs_1;

package Vs_2 is
   type Variable_String is
      record
         Data : String (1 .. 250) := (others => ' ');
         Len  : Natural := 0;
      end record;
   function Variable_String_From_User
      return Variable_String;
end Vs_2;


-- Resources, which no longer depend upon
-- the above context declarations:

generic
   type Component is private;
package Gen_Pm_1 is
   type Phone_Message is
      record
         From : Component;
         To   : Component;
         Data : Component;
      end record;
   function Phone_Message_From_User
      return Phone_Message;
end Gen_Pm_1;

generic
   type Component is private;
package Gen_Mm_2 is
   type Mail_Message is
      record
         From : Component;
         To   : Component;
         Subj : Component;
         Text : Component;
      end record;
   function Mail_Message_From_User
      return Mail_Message;
end Gen_Mm_2;

Now, the programmer who is trying to reuse the above
declarations by combining the use of Mail_Messages with the
use of Phone_Messages has another option. Instead of trying to
combine both contexts, just one can be chosen (in this case,
Vs_2):

with Vs_2;
with Gen_Pm_1;
with Gen_Mm_2;
procedure User is
   package Pm_1 is new
      Gen_Pm_1 (Vs_2.Variable_String);
   package Mm_2 is new
      Gen_Mm_2 (Vs_2.Variable_String);
   Name : Vs_2.Variable_String;
   Pm   : Pm_1.Phone_Message :=
             Pm_1.Phone_Message_From_User;
   Mm   : Mm_2.Mail_Message :=
             Mm_2.Mail_Message_From_User;
begin
   Name  := Mm.From;
   Pm.To := Name;   -- now OK
end User;

An additional complexity is required for this example. The
resources must be able to obtain component type values from
which to construct mail and phone messages. Although this is
not obvious from the specifications only, it can be assumed
that such functionality must be available in the body. This can
be done by adding a generic formal function parameter to the
generic parts, requiring the user to supply an additional 
parameter to the instantiations as well: 


generic
   type Component is private;
   with function Component_From_User
      return Component;
      -- parameterless for simplicity
package Gen_Pm_1 is
   type Phone_Message is
      record
         From : Component;
         To   : Component;
         Data : Component;
      end record;
   function Phone_Message_From_User
      return Phone_Message;
end Gen_Pm_1;

Although the above examples show the context, the 
resource, and the user as library level units, declaration 
dependence can occur, and transformations can be applied, in 
situations where the three components are nested. For 
example, the resource and user can be co-resident in a
declarative area, or the user can contain the resource or vice 
versa. 

This reiterates the earlier claim that, at least for the
purpose of this model, it does not matter if the data is
encapsulated with its applicable function; it just makes it
easier to find if it is. In the programs studied, the lowest level
data types, which were often properly encapsulated with their
immediately available operations, were used to construct
higher level resources specific to the problem being solved. It
was unusual for those resources to be written with the same
level of encapsulation and independence as the lower level
types, and this resulted in the kind of context-resource-user
dependencies illustrated above.

For example, in the case of the generalized simple data 
base, the functionality of the data appears in the resource
while the declaration of it appears in the context. The only
place where the higher-level object comes into existence is
inside the user component, at the point where the instantiation
is declared. If desired, an additional transformation can be
applied to rectify this problem of the apparent separation of
the object from its operations. Instead of leaving the
instantiation of the new generic resource up to the client
software, an intermediate package can be created which
combines the visibility of the context declarations with
instantiations of the generic resource. This package, then,
becomes the direct resource for the client software,
introducing a layer of abstraction that was not present in the
original (non-general) structure. 

For example, the following transformation to the second 
example above combines the resource General_Store with the
context of choice, type Typ from package Decls. The
declaration of the package Object performs this service. 

generic
   type Typ is private;
package General_Store is
   procedure Put (Obj : in Typ);
   procedure Get_Last (Obj : out Typ);
end General_Store;

package Decls is
   type Typ is ...
end Decls;

with Decls;
with General_Store;
package Object is
   subtype Typ is Decls.Typ;
   package Store is new General_Store (Typ);
   procedure Put (Obj : in Typ)
      renames Store.Put;
   procedure Get_Last (Obj : out Typ)
      renames Store.Get_Last;
end Object;

with Object;
procedure Client is
   Item : Object.Typ;
begin
   Object.Put (Item);
   Object.Get_Last (Item);
end Client;

Note that no body for package Object is required using the
style shown. If it were preferable to leave the implementation
of Object flexible, so that users would not need to be
recompiled if the context used by the instantiation were to
change, the context clauses and the instantiation could be made
to appear only in the body of Object. An alternate, admittedly
more complex, example is shown here which accomplishes
this flexibility:

package Object is
   type Typ is private;
   function Initial return Typ;
   procedure Put (Obj : in Typ);
   procedure Get_Last (Obj : in Typ);
private
   type Designated;                 -- completed in the body
   type Typ is access Designated;
end Object;

with Decls;
with General_Store;
package body Object is

   -- Completion of the incomplete type declared in the private part.
   type Designated is new Decls.Typ;

   function Initial return Typ is
   begin
      return new Designated;
   end Initial;

   -- The store holds the designated objects, not the access values.
   package Store is new General_Store (Designated);

   procedure Put (Obj : in Typ) is
   begin
      Store.Put (Obj.all);
   end Put;

   procedure Get_Last (Obj : in Typ) is
   begin
      Store.Get_Last (Obj.all);
   end Get_Last;
end Object;

In the alternate example, note that the parameter mode for
the Get_Last procedure needed to be changed to allow the
reading of the designated object of the actual access parameter.
Also, a simple initialization function was supplied to provide
the client with a way of passing a non-null access object to the
Put and Get_Last procedures. Normally, there would already
be initialization and constructor operations, so this additional
operation would not be needed. The advantage of this
alternative is that the implementation of the type and
operations can change without disturbing the client software.
However, the first alternative could be changed in a
compilation-compatible way, such that any client software
would need recompilation but no modification.

It is also possible to provide just an instantiation as a
library unit by itself, but this requires the user to acquire
independently the visibility to the same context as that
instantiation. This solution results in the reconstruction of
the original situation, where the instantiation becomes the
resource dependent on a context, and the user depends on both.
The important difference, however, is that now the resource
(the instantiation) is not viewed as a reusable component. It
becomes application-specific and can be routinely (potentially
automatically) generated from both the generalized reusable
resource and the context of choice, while the generic from
which the instantiation is produced remains the independent,
reusable component. The advantage of this structure lies in
the abstraction provided for the user component which is
insulated from the complexities of the instantiation of the
reusable generic. Since the result is similar to the initial
architecture, the overall software architecture can be 
preserved while utilizing generic resources. The following 
illustrates this. 

package Decls is
   type Typ is ...
end Decls;

generic
   type Typ is private;
package General_Store is
   procedure Put (Obj : in Typ);
   procedure Get_Last (Obj : out Typ);
end General_Store;

with Decls;
with General_Store;
package Object is new General_Store (Decls.Typ);

with Decls;
with Object;
procedure Client is
   Item : Decls.Typ;
begin
   Object.Put (Item);
   Object.Get_Last (Item);
end Client;

By modifying the generic resource to "pass through" the
generic formal types, the user's reliance on the context can be
removed: 

generic
   type Gen_Typ is private;
package General_Store is
   subtype Typ is Gen_Typ;   -- pass the type through
   procedure Put (Obj : in Typ);
   procedure Get_Last (Obj : out Typ);
end General_Store;

package Decls is
   type Typ is ...
end Decls;

with Decls;
with General_Store;
package Object is new General_Store (Decls.Typ);

with Object;
procedure Client is
   Item : Object.Typ;
begin
   Object.Put (Item);
   Object.Get_Last (Item);
end Client;


Measurement

In the above examples, the context components were never
modified. Resource components were modified to eliminate 
their dependence on context components. User components 
were modified in order to maintain their functionality given 
the now general resource components, typically by defining 
generic actual parameter objects and adding an instantiation. 
In the case of the encapsulated instantiations, an intermediate 
component was introduced to free the user component of the
complexity of the instantiation. It is the ease or difficulty of 
modifying the resource components that is of primary interest 
here, and the measurement of this modification effort 
constitutes a measurement of the reusability of the 
components. The usability of the generalized resources is also 
of interest, since some may be difficult to instantiate.

Considering the above examples again, the simple data base
resource Store required the removal of the context clause and
the creation of a generic part (these being typical
modifications for almost all transformations of this kind). In
addition, the formal parameter types for the two subprograms 
were changed to the generic formal private type, causing a 
change to both the subprogram specification and body. No 
further changes were required. 


-- original:

with Decls;
package Store is
   procedure Put (Obj : in Decls.Typ);
   procedure Get_Last (Obj : out Decls.Typ);
end Store;

package body Store is
   Local : Decls.Typ;

   procedure Put (Obj : in Decls.Typ) is
   begin
      Local := Obj;
   end Put;

   procedure Get_Last (Obj : out Decls.Typ) is
   begin
      Obj := Local;
   end Get_Last;
end Store;


-- transformed:

generic
   type Typ is private;                      -- change
package General_Store is
   procedure Put (Obj : in Typ);             -- change
   procedure Get_Last (Obj : out Typ);       -- change
end General_Store;

package body General_Store is
   Local : Typ;

   procedure Put (Obj : in Typ) is           -- change
   begin
      Local := Obj;
   end Put;

   procedure Get_Last (Obj : out Typ) is     -- change
   begin
      Obj := Local;
   end Get_Last;
end General_Store;

The Phone_Message and Mail_Message resources required
the deletion of the context clause, the addition of a generic part
consisting of a formal private type parameter and a formal
subprogram parameter, and the replacement of three
occurrences (or four, in the case of Mail_Message) of the type
mark Vs_1.Variable_String with the generic formal type
Component.

-- original:

with Vs_1;
package Pm_1 is
   type Phone_Message is
      record
         From : Vs_1.Variable_String;
         To   : Vs_1.Variable_String;
         Data : Vs_1.Variable_String;
      end record;
   function Phone_Message_From_User
      return Phone_Message;
end Pm_1;

-- transformed:

generic
   type Component is private;            -- change
   with function Component_From_User
      return Component;                  -- change
package Gen_Pm_1 is
   type Phone_Message is
      record
         From : Component;
         To   : Component;
         Data : Component;
      end record;
   function Phone_Message_From_User
      return Phone_Message;
end Gen_Pm_1;

Generalizing the bodies of Gen_Pm_1 and Gen_Mm_2
would involve replacing any calls to the
Variable_String_From_User functions with calls to the generic
formal Component_From_User function. In the case of the simple
bodies shown before, this would require three and four simple
substitutions, for Gen_Pm_1 and Gen_Mm_2, respectively.

In addition to measuring the reusability of a unit by the
amount of transformation required to maximize its
independence, reusability can also be gauged by the amount of
residual dependency on other units which cannot be
eliminated, or which is unreasonably difficult to eliminate, by
any of the proposed transformations. For any given unit,
therefore, two values can be obtained. The first reveals the
number of program changes which would be required to
perform any applicable transformations. The second indicates
the amount of dependence which would remain in the unit even
after it was transformed. The original units in the examples
above would score high on the first scale since the handling
required for their conversion was negligible, implying that their
reusability was already good (i.e., they were already independent
or were easy to make independent of external declarations).
After the transformation, there remain no latent dependencies, 
so the transformed generic would receive a perfect reusability 
score. 

Note that the object of any reusability measurement, and
therefore, of any transformations, need not be a single Ada
unit. If a set of library units were intended to be reused
together then the metrics as well as the transformations could
be applied to the entire set. Whereas there might be
substantial interdependence among the units within the set, it
still might be possible to eliminate all dependencies on
external declarations.

In the above examples, one reason that the transformation
was trivial was that the only operation performed on objects
of the external type was assignment (except for the mail and
phone message examples). Therefore, it was possible to
replace direct visibility to the external type definition with a
generic formal private type. A second example illustrates a
slightly more difficult transformation which includes more
assumptions about the externally declared type. In the
following example, indexing and component assignment are
used by the resource.

Before transformation:

-- context
package Arr is
   type Item_Array is
      array (Integer range <>) of Natural;
end Arr;

-- resource
with Arr;
procedure Clear (Item : out Arr.Item_Array) is
begin
   for I in Item'Range loop
      Item (I) := 0;
   end loop;
end Clear;

-- user
with Arr, Clear;
procedure Client is
   X : Arr.Item_Array (1 .. 10);
begin
   Clear (X);
end Client;

After transformation:

-- context (same)
package Arr is
   type Item_Array is
      array (Integer range <>) of Natural;
end Arr;

-- generalized resource
generic
   type Component is range <>;
   type Index is range <>;
   type Gen_Array is
      array (Index range <>) of Component;
procedure Gen_Clear (Item : out Gen_Array);
procedure Gen_Clear (Item : out Gen_Array) is
begin
   for I in Item'Range loop
      Item (I) := 0;
   end loop;
end Gen_Clear;

-- user
with Arr, Gen_Clear;
procedure Client is
   X : Arr.Item_Array (1 .. 10);
   procedure Clear is new Gen_Clear
      (Natural,
       Integer,
       Arr.Item_Array);
begin
   Clear (X);
end Client;

The above transformation removes compilation dependencies,
and allows the generic procedure to describe its essential
function without the visibility of external declarations. As
before, an intermediate effect could be created to free the user
procedure from the chore of instantiating a Clear procedure,
which requires visibility to both the context and the resource.
However, it also illustrates an important additional kind of
dependence which can exist between a resource and its users,
namely information dependence.

In the previous example, the literal value 0 is a clue to the
presence of information that is not general. Therefore, the
following would be an improvement over the transformation
shown above:


generic
   type Component is range <>;
   type Index is range <>;
   type Gen_Array is
      array (Index range <>) of Component;
   Init_Val : Component := Component'First;
procedure Gen_Clear (Item : out Gen_Array);
procedure Gen_Clear (Item : out Gen_Array) is
begin
   for I in Item'Range loop
      Item (I) := Init_Val;
   end loop;
end Gen_Clear;

Note that the last transformation allows the user to supply
an initial value, but also provides the lowest value of the
component type as a default. An additional refinement would be
to make the component type private, which would mean that
Init_Val could not have a default value. Information
dependencies such as the one illustrated here are harder to
detect than compilation dependencies. The appearance of
literal values in a resource is often an indication of an
information dependence.
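
The private-type refinement described in the preceding paragraph
can be sketched as follows. This is only an illustration under
stated assumptions, not code from the study: the enclosing
procedure Private_Clear_Demo, the type Name_Array, and the
instance Blank are hypothetical names, and all units are nested in
one procedure merely so that the sketch compiles as a single unit.

```ada
with Ada.Text_IO;

procedure Private_Clear_Demo is

   --  Component is now private: the unit may only assign and compare
   --  it, so Init_Val has no default and must always be supplied.
   generic
      type Component is private;
      type Index is range <>;
      type Gen_Array is array (Index range <>) of Component;
      Init_Val : in Component;
   procedure Gen_Clear (Item : out Gen_Array);

   --  Hypothetical client array with a non-numeric component type.
   type Name_Array is array (Integer range <>) of Character;

   Name : Name_Array (1 .. 4) := (others => 'x');

   procedure Gen_Clear (Item : out Gen_Array) is
   begin
      for I in Item'Range loop
         Item (I) := Init_Val;
      end loop;
   end Gen_Clear;

   procedure Blank is new Gen_Clear (Character, Integer, Name_Array, ' ');

begin
   Blank (Name);
   --  Check that every component received the supplied value.
   for I in Name'Range loop
      if Name (I) /= ' ' then
         raise Program_Error;
      end if;
   end loop;
   Ada.Text_IO.Put_Line ("all components set to the supplied value");
end Private_Clear_Demo;
```

Because the formal component type is private, the same generic
now serves component types that are not integers at all, at the
cost of one more generic actual parameter in every instantiation.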


A third form of dependence, called protocol dependence, has
also been identified. This occurs when the user of a resource
must obey certain rules to ensure that the resource behaves
properly. For example, a stack which is used to buffer
information between other users could be implemented in a
not-so-abstract fashion by exposing the stack array and top
pointer directly. In this case, all users of the stack must
follow the same protocol of decrementing the pointer before
popping and incrementing after pushing, and not the other way
around. Beyond the recognition of it, no additional treatment
of this form of dependence between components will appear in
this study.
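
A minimal sketch of the stack example may make the protocol
concrete. It is an assumption-laden illustration rather than code
from the study: the names Stack_Protocol_Demo, Exposed_Stack,
Abstract_Stack, and Capacity are hypothetical, and no overflow or
underflow handling is attempted.

```ada
with Ada.Text_IO;

procedure Stack_Protocol_Demo is

   Capacity : constant := 10;

   --  Not-so-abstract version: array and pointer are exported, so
   --  every user must update Top in the agreed order.
   package Exposed_Stack is
      Data : array (1 .. Capacity) of Integer;
      Top  : Positive := 1;   --  index of the next free slot
   end Exposed_Stack;

   --  Abstract version: the same protocol, written once and hidden.
   package Abstract_Stack is
      procedure Push (Value : in Integer);
      procedure Pop (Value : out Integer);
   end Abstract_Stack;

   Result : Integer;

   package body Abstract_Stack is
      Data : array (1 .. Capacity) of Integer;
      Top  : Positive := 1;

      procedure Push (Value : in Integer) is
      begin
         Data (Top) := Value;
         Top := Top + 1;      --  increment after pushing
      end Push;

      procedure Pop (Value : out Integer) is
      begin
         Top := Top - 1;      --  decrement before popping
         Value := Data (Top);
      end Pop;
   end Abstract_Stack;

begin
   --  Each user of the exposed stack repeats the protocol by hand.
   Exposed_Stack.Data (Exposed_Stack.Top) := 7;
   Exposed_Stack.Top := Exposed_Stack.Top + 1;
   Exposed_Stack.Top := Exposed_Stack.Top - 1;
   Result := Exposed_Stack.Data (Exposed_Stack.Top);
   Ada.Text_IO.Put_Line ("exposed: " & Integer'Image (Result));

   --  Users of the abstract stack cannot get the order wrong.
   Abstract_Stack.Push (7);
   Abstract_Stack.Pop (Result);
   Ada.Text_IO.Put_Line ("abstract:" & Integer'Image (Result));
end Stack_Protocol_Demo;
```

In the exposed version, a user who swaps the two statements of
either pair still compiles cleanly, which is what makes this kind
of dependence hard to see.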


Formalizing the Transformations

The following is a formalization of the objectives of 
transformations which are needed to remove declaration 
dependence. 

1. Let P represent a program unit.

2. Let D represent the set of n object declarations, d1 .. dn,
directly referenced by P such that di is of a type declared
externally to P.

3. Let O1 .. On be sets of operations where Oi is the set of
operations applied to di inside P.

4. P is completely transformable if each operation in each of
the sets O1 .. On can be replaced with a predefined or generic
formal operation.

The earlier example transformation is reviewed in the
context of these definitions:


1. Let P represent a program unit.

P = procedure Clear (Item : out Arr.Item_Array) is ..

2. Let D represent the set of n object declarations, d1 .. dn,
directly referenced by P such that di is of a type declared
externally to P.

D = { Arr.Item_Array }

3. Let O1 .. On be sets of operations where Oi is the set of
operations applied to di inside P.

O1 =
{ indexing by integers, integer assignment to components }

4. P is completely transformable if each operation in each of
the sets O1 .. On can be replaced with a predefined or generic
formal operation.

Indexing can be obtained through a generic formal array
type. Although no constraining operation was used, the formal
type could be either constrained or unconstrained since the
only declared object is a formal subprogram parameter.
Since component assignment is required, the component type
must not be limited. Therefore, the following generic formal
parts are possible:

type Component is range <>;
type Index is range <>;

followed by either:

type Gen_Array is array (Index) of Component;

or:

type Gen_Array is
   array (Index range <>) of Component;

Notice that some operations can be replaced with generic 
formal operations more easily than others. For example, 
direct access of array structures can generally be replaced by 
making the array type a generic formal type. However, direct 
access into record structures (using "dot" notation)
complicates transformations since this operation must be 
replaced with a user-defined access function. 
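
The replacement of "dot" notation by an access function can be
sketched as below. The sketch is illustrative only: the record
type Reading, its accessor Count_Of, and the generic Gen_Is_Empty
are hypothetical names that do not appear in the programs studied,
and the units are nested in one procedure so that the example is
a single compilation unit.

```ada
with Ada.Text_IO;

procedure Record_Access_Demo is

   --  Application-specific context (hypothetical).
   type Reading is
      record
         Count : Natural := 0;
      end record;

   function Count_Of (R : Reading) return Natural;

   --  Generalized resource: the record is visible only as a private
   --  type, and what used to be "R.Count" is a formal access function.
   generic
      type Element is private;
      with function Get_Count (E : Element) return Natural;
   function Gen_Is_Empty (E : Element) return Boolean;

   Sample : Reading;

   function Count_Of (R : Reading) return Natural is
   begin
      return R.Count;   --  direct access stays next to the declaration
   end Count_Of;

   function Gen_Is_Empty (E : Element) return Boolean is
   begin
      return Get_Count (E) = 0;
   end Gen_Is_Empty;

   function Is_Empty is new Gen_Is_Empty (Reading, Count_Of);

begin
   Ada.Text_IO.Put_Line (Boolean'Image (Is_Empty (Sample)));
   Sample.Count := 3;
   Ada.Text_IO.Put_Line (Boolean'Image (Is_Empty (Sample)));
end Record_Access_Demo;
```

Each record component used by the resource needs its own formal
function, which is why record access inflates the generic part
more quickly than array indexing does.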


Application to External Software
Medium-Sized Projects

To test the feasibility of the transformations proposed, a
6,000-line Ada program written by seven professional
programmers was examined for reuse transformation
possibilities. The program consisted of six library units,
ranging in size from 20 to 2,400 lines. Of the 30
theoretically possible dependencies that could exist among
these units, ten were required. Four transformations of the
sort described above were made to three of the units. These
required an additional 44 lines of code (less than a 1%
increase) and reduced the number of dependencies from ten to
five, which is the minimum possible with six units. Using one
possible program change definition, each transformation
required between two and six changes.

A fifth modification was made to detach a nested unit from
its parent. This required the addition of 15 lines and resulted
in a total of seven units with the minimum six dependencies.
Next, two other functions were made independent of the other
units. Unlike the previous transformations which were
targeted for later reuse, however, these transformations
resulted in a net reduction in code since the resulting
components were reused at multiple points within this
program. Substantial information dependency which would
have impaired actual reuse was identified but remained within
the units, however.
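
The dependency counts quoted above can be reproduced with a
little arithmetic. The sketch below rests on two assumptions that
the text does not state explicitly: that a dependency is an
ordered pair of distinct units, giving n(n-1) possibilities, and
that the minimum is n-1 because every unit must remain connected
to the rest of the program. The function names are hypothetical.

```ada
with Ada.Text_IO;

procedure Dependency_Bounds is

   --  Assumption: a dependency is an ordered pair of distinct units.
   function Possible (Units : Positive) return Natural is
   begin
      return Units * (Units - 1);
   end Possible;

   --  Assumption: every unit stays connected to the rest of the
   --  program, so at least Units - 1 dependencies are needed.
   function Minimum (Units : Positive) return Natural is
   begin
      return Units - 1;
   end Minimum;

begin
   Ada.Text_IO.Put_Line
     ("6 units, possible:" & Natural'Image (Possible (6)));
   Ada.Text_IO.Put_Line
     ("6 units, minimum: " & Natural'Image (Minimum (6)));
   Ada.Text_IO.Put_Line
     ("7 units, minimum: " & Natural'Image (Minimum (7)));
end Dependency_Bounds;
```

Under these assumptions the three values agree with the figures
of 30, five, and six reported for the program.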

A second medium-sized project was studied which
exhibited such a high degree of mutual dependence between
pairs of library units that, instead of selecting smaller units
for generalizations, the question of non-hierarchical
dependence was studied at a system level. The general
conclusion from this was that loops in the dependency
structure (where, for example, package A is referenced from
package body B and package B is referenced from package body
A) make generalization of those components difficult. The
program was instead analyzed for possible restructuring to
remove as much of the bidirectional dependence as practical.
This was partially successful and suggests that this sort of
redesign might appropriately precede other reuse analyses.

The NASA Projects

Currently, the research project is examining several
spacecraft flight simulation programs from the NASA Goddard
Space Flight Center. These programs are each more than
100,000 editor lines of Ada. They have been developed by an
organization that originally developed such simulators in
Fortran and has been transitioning to the use of Ada over the
past several years. Because all the programs are in the same
application domain and were developed by the same
organization there is considerable opportunity for reuse. In
the past, the development organization reported the ability to
reuse about 20% of earlier programs when a new program
was being developed in Fortran. However, since becoming
familiar with Ada, the same organization is now reporting a
70% reuse rate, or better.

After gaining an understanding of the nature of the reuse
accomplished in Fortran and later in Ada, and how similar or
different reuse in the two languages was, we would like to test
several theories about why the Ada reuse has been so much
greater. We already know that the reuse is accomplished by
modifying earlier components as required, and not, in general,
by using existing software verbatim. Because of this reuse
mode, one theory we will be testing is that the Ada programs
are more reusable simply because they are more
understandable.

For the current study, the programs were studied to
reveal opportunities to extract generic components which, had
they been available when the programs were being developed
originally, could have been reused without modification.
There is an additional advantage to working with this data,
however, since, as mentioned above, the several programs
already exhibit significant functional similarities which can
be studied for possible generalization. In other words,
whereas the initial discussion of generic extraction has
focused on attempts to completely free the essential function
of a component from its static declaration context, this data
gives examples of similar components in two or more different
program contexts and therefore allows us to study the
possibility of freeing a component from only its
program-specific context and not from any context which remains
constant across programs.

This gives rise to the notion of domain-specific generic
extraction as opposed to domain-independent generic
extraction. Given the problems associated with extracting a
completely general component, as examined earlier, a case can
be made to generalize away only some of the dependence,
leaving the rest in place. The additional problem, then,
becomes how to determine what dependence is permissible and
what should be removed. The permissible dependence would be
common across projects in a certain domain, and would
therefore be domain-specific, while the dependence to be
removed would be the problem-specific context. When
reused, then, these components would have their
problem-specific context supplied as generic actual parameters.

This is currently a largely manual task, since the
programs must be compared to find corresponding
functionality and then examined to determine the intersection
of that functionality. Interestingly, on the last project the
developers themselves have also been devising generic
components which are instantiated only one time within that
program. This implied to us that some effort was being spent
to make components which might be reusable with no, or
perhaps only very little, modification in the next project. We
have confirmed with the developers that this is in fact the case.
By comparing the results of our generalizations with those
done by the developers, we find that ours have much more
complex generic parts but correspondingly much less
dependence on other software. This is a reasonable result,
since the developers already have some idea about the context
for each reuse of a given generic: what aspects of that context
are likely to change from project to project and what aspects
are expected to remain constant across several programs. Only
the program-specific context appears in the generic parts
of the generics written by the developers, while our
generalizations have generic parts which contain declarations
of types and operations which apparently do not need to change
as long as the problem domain remains the same. In other
words, when our generic parts are devised by analyzing only a
single instance of a component, we cannot distinguish between
program-specific and domain-specific generalizations.

One interesting question we would like to answer is
whether we can derive the generic part that makes the most
sense within this domain by comparing similar components
from different programs and generalizing only on their
differences, leaving the software in the intersection of the
components unchanged. In this way, a component would be
derived which would not be completely independent but, like
the developer-written generics, would be sufficiently
independent for reuse in the domain. Then, a comparison with
the generics developed within the organization would be
revealing. If the generics are similar then our process might
be useful on other parts of the software that have not yet been
generalized by the developers. However, if they differ
greatly, it would be useful to characterize that difference and
understand what additional knowledge must be used in
generalizing the repeated software. Unfortunately, there is
not enough reuse of the developers' generics yet to make this
final comparison, but a project is currently in progress which
should supply some of this data.

The following example illustrates the complexity of the
generic parts which were required to completely isolate a
typical unit from its context. Here, the procedure
Check_Header was removed from a package body and
generalized to be able to stand alone as a library level generic
procedure.

generic
   type Time is private;
   type Duration is digits <>;
   with function Enable return Boolean;
   type Hd_Rec_Type is private;
   with procedure Set_Start
      (H : in out Hd_Rec_Type; To : Duration);
   with function Get_Start
      (H : Hd_Rec_Type) return Duration;
   with procedure Set_Stop
      (H : in out Hd_Rec_Type; To : Duration);
   with function Get_Stop
      (H : Hd_Rec_Type) return Duration;
   type Real is digits <>;
   with function Get_Att_Int
      (H : Hd_Rec_Type) return Real;
   with function Conv_Time
      (D_Float : Duration) return Duration;
   Header_Rec : in out Hd_Rec_Type;
   Goesim_Time_Step : in out Duration;
   with function Seconds_Since_1957
      (T : in Time) return Duration;
   with procedure Debug_Write (Output : String);
   with procedure Debug_End_Line;
   type Direct_File_Type is limited private;
   with procedure Direct_Read
      (File : Direct_File_Type);
   with procedure Direct_Get
      (File : in Direct_File_Type;
       Item : out Hd_Rec_Type);
   with function Image_Of_Base_10
      (Item : Duration) return String;
   with procedure Header_Data_Error;
procedure Check_Header_Generic
   (Simulation_Start_Time : in Time;
    Simulation_Stop_Time  : in Time;
    Simulation_Time_Step  : in Duration;
    History_File          : in out Direct_File_Type);

The instantiation of this generic part is correspondingly
complex:

procedure Check_Header_Instance is new
   Check_Header_Generic
      (Abstract_Calendar.Time,
       Abstract_Calendar.Duration,
       Debug_Enable,
       Attitude_History_Types.Header_Record,
       Set_Start,
       Get_Start,
       Set_Stop,
       Get_Stop,
       Utilities.Real,
       Get_Att_Hist_Out_Int,
       Converted_Time,
       History_Data.Header_Rec,
       History_Data.Goesim_Time_Step,
       Timer.Seconds_Since_1957,
       Error_Collector.Write,
       Error_Collector.End_Line,
       Direct_Mixed_Io.File_Type,
       Direct_Mixed_Io.Read,
       Get_From_Buffer,
       Image_Of_Base_10,
       Raise_Header_Data_Error);

In contrast, a typical generic part on a unit which was
developed and delivered as part of the most recent completed
project by the developers themselves is shown here:

with Css_Types;
generic
   Number_Of_Sensors : Natural :=
      Css_Types.Number_Of_Sensors;
   with function Initialize_Sensor
      return Css_Types.Css_Database_Type is <>;
package Generic_Coarse_Sun_Sensor is

Note that by allowing the visibility of Css_Types, the
generic part was simplified. Being unfamiliar with the
domain, had we attempted to generalize Coarse_Sun_Sensor by
examining only the non-generic version of a corresponding
component in another program we would not be able to tell
whether the dependence on Css_Types was program-specific
or domain-specific. Here, however, the developer leads us to
believe that Css_Types is domain-specific while the number
of sensors and sensor initialization is program-specific.


Guidelines 

The manual application of the principles and techniques of
generic transformation and extraction has revealed several
interesting and intuitively reasonable guidelines relative to
the creation and reuse of Ada software. In general, these
guidelines appear to be applicable to programs of any size.
However, the last guideline in the list, concerning program
structure, was the most obvious when dealing with medium to
large programs.

• Avoid direct access into record components except in the
same declarative region as the record type declaration.

Since there is no generic formal record type in Ada
(without dynamic binding such a feature would be
impractical) there is no straightforward way to replace
record component access with a generic operation. Instead,
user-supplied access functions are needed to access the
components and the type must be passed as a private type.
This is unlike array types for which there are two generic
formal types (constrained and unconstrained). This supports
the findings of others which assert that direct referencing of
non-local record components adversely affects maintainability
[6].

• Minimize non-local access to array components.

Although not as difficult in general as removing dependence
on a record type, removing dependence on an array type can be
cumbersome.

• Keep direct access to data structures local to their 
declarations. 

This is a stronger conclusion than the previous two, and
reinforces the philosophy of using abstract data types in all
situations where a data type is available outside its local
declarative region. Encapsulated types are far easier to
separate as resources than globally declared types since the
operations are localized and contained.

• Avoid the use of literal values except as constant value 
assignments. 

Information dependence is almost always associated with
the use of a literal value in one unit of software that has some
hidden relationship to a literal value in a different unit. If a
unit is generalized and extracted for reuse but contains a
literal value which indicates a dependence on some assumption
about its original context, that unit can fail in unpredictable
ways when reused. Conventional wisdom applies here, and it
might be reasonable to relax the restriction to allow the use of
0 and 1. However, experience with a considerable amount of
software which makes the erroneous assumption that the first
index of any string is 1 has shown that even this can lead to
problems.
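
The string-index pitfall mentioned above is easy to reproduce.
The following sketch is illustrative only, with hypothetical names
(First_Index_Demo, First_Char_Fragile, First_Char); it relies on
the standard Ada rule that a slice keeps the index bounds of the
string from which it was taken.

```ada
with Ada.Text_IO;

procedure First_Index_Demo is

   Line : constant String := "reuse: Ada";

   --  Fragile: the literal 1 assumes every String starts at index 1.
   function First_Char_Fragile (S : String) return Character is
   begin
      return S (1);
   end First_Char_Fragile;

   --  Robust: the bound is taken from the actual parameter.
   function First_Char (S : String) return Character is
   begin
      return S (S'First);
   end First_Char;

begin
   --  The slice Line (8 .. 10) has bounds 8 .. 10, not 1 .. 3.
   Ada.Text_IO.Put_Line
     ("S'First version: " & First_Char (Line (8 .. 10)));
   begin
      Ada.Text_IO.Put_Line
        ("literal version: " & First_Char_Fragile (Line (8 .. 10)));
   exception
      when Constraint_Error =>
         Ada.Text_IO.Put_Line ("literal version: Constraint_Error");
   end;
end First_Index_Demo;
```

The fragile version works for every caller that happens to pass a
string starting at 1, so the hidden assumption tends to surface
only after the unit has been reused in a new context.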

• Avoid mingling resources with application-specific contexts.

Although the purpose of the transformations is to separate
resources from application-specific software regardless of the
program structure, certain styles of programming result in
programs which can be transformed more easily and
completely. By staying conscious of the ultimate goal of
separating reusable function from application declarations,
whether or not the functionality is initially programmed to be
generic, programmers can simplify the eventual
transformation of the code.

• Keep interfaces abstract.

Protocol dependencies arise from the exportation of
implementation details that should not be present in the
interface to a resource. Such an interface is vulnerable
because it assumes a usage protocol which does not have to be
followed by its users. The bad stack example illustrates what
can happen when a resource interface requires the use of
implementation details; however, even resources with an
appropriately abstract interface can export unwanted
additional detail which can lead to protocol dependence.

• Avoid direct reference to package Standard.Float.

Even when used to define other floating point types, direct
reference to Float establishes an implementation dependence
that does not occur with anonymous floating point declarations.
Especially dangerous is a direct reference to
Standard.Long_Float, Standard.Long_Integer, etc., since they
may not even compile on different implementations. Some
care must also be taken with Integer, Positive, and Natural,
though in general they were not associated with as much
dependence as Float. Note that fixed point types in Ada are
constructed as needed by the compiler. Perhaps the same
philosophy should have been adopted for Float and Integer.
Reference to Character and Boolean is not a problem since they
are the same on all implementations.
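
One way to follow this guideline is to state the required
precision and let each compiler choose a representation. The
sketch below is an assumption-based illustration, not code from
the projects studied: the names Real_Type_Demo, Real, Real_IO, and
Angle are hypothetical, and six digits is an arbitrary choice.

```ada
with Ada.Text_IO;

procedure Real_Type_Demo is

   --  The precision is stated as a requirement; no reference is made
   --  to Standard.Float, whose precision varies by implementation.
   type Real is digits 6;

   Angle : Real := 0.5;

   package Real_IO is new Ada.Text_IO.Float_IO (Real);

begin
   Angle := Angle * 2.0;
   Real_IO.Put (Angle, Fore => 1, Aft => 2, Exp => 0);
   Ada.Text_IO.New_Line;
end Real_Type_Demo;
```

A declaration such as "type Real is new Float;" would compile
everywhere as well, but it would silently inherit whatever
precision Float has on the implementation at hand.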

• Avoid the use of 'Address.

Even though it is not necessary to be in the scope of
package System to use this attribute, it sets up a dependency
on System.Address that makes the software non-portable. If
this attribute is needed for some low-level programming then
it should be encapsulated and never be exposed in the interface
to that level.

• Consider the inter-component dependence of a design.

By understanding how functionally-equivalent programs
can vary in their degree of inter-component dependence,
designers and developers can make decisions about how much
dependence will be permitted in an evolving system, and how
much effort will be applied to limit that dependence. For
system developments which are expected to yield reusable
components directly, a decision can be made to minimize
dependencies from the outset. For developments which are not
able to make such an investment in reusability, a decision can
be made to allow certain kinds of dependencies to occur. In
particular, dependencies which are removable through
subsequent transformation might be allowed while those that
would be too difficult to remove later might be avoided. A
particularly cumbersome type of dependence occurs when two
library units reference each other, either directly or
indirectly. This should be avoided if at all possible. By
making structural decisions explicitly, surprises can be
avoided which might otherwise result in unwanted limitations
of the developed software.